PREFACE


Since  public  health  surveillance  undergirds  public  health  practice,  it  is  unfortunate  that  no  single 
resource  has  been  available  to  provide  a  guide  to  the  underlying  principles  and  practice  of 
surveillance.  In  recent  years,  a  small  number  of  courses  on  surveillance  at  schools  of  public 
health  have  been  developed  in  recognition  of  the  importance  of  surveillance,  but  no  definitive 
textbook  has  appeared.  Principles  and  Practice  of  Public  Health  Surveillance   is  intended  to  serve 
as  a  desk  reference  for  those  actively  engaged  in  public  health  practice  and  as  a  text  for  students 
of  public  health. 

The  book  is  organized  around  the  science  of  surveillance,  i.e.,  the  basic  approaches  to  planning, 
organizing,  analyzing,  interpreting,  and  communicating  surveillance  information  in  the  context  of 
contemporary  society  and  public  health  practice.  Surveillance  provides  the  information  base  for 
public health decision making. It must continually respond to the need for new information on topics such as
chronic diseases, occupational and environmental health, injuries, risk factors, and emerging
health problems. It must also accommodate changing priorities. Issues, such as long latency,
migration,  low  frequencies,  and  the  need  for  local  data,  must  be  addressed.  New  analytic  methods 
and  rapidly  evolving  technologies  present  new  opportunities  and  create  new  demands.   This  book 
addresses  many  of  these  issues.  Although  many  examples  of  surveillance  systems  are  included,  this 
is  not  intended  to  be  a  manual  for  establishing  surveillance  for  any  particular  condition.   We 
believe  that  this  approach  will  provide  the  reader  with  ideas  and  concepts  that  can  be  adapted  to 
her  or  his  particular  needs. 

This  book  grew  out  of  a  recognition  by  the  Surveillance  Coordination  Group  at  the  Centers  for 
Disease  Control  of  the  need  to  capture  the  art  as  well  as  the  science  of  surveillance.  Most  of  the 
authors  are  current  or  former  staff  in  the  Epidemiology  Program  Office  at  the  Centers  for  Disease 
Control. These friends and colleagues have drawn on their own experience in surveillance in states,
in a diversity of federal programs, and in international health, and have interwoven
the experience of others. We felt that the risks of being parochial were outweighed
by the desirability of producing a consistent and systematic coverage of the subject. Although most
examples are drawn from the United States, they illustrate basic principles and approaches that can
be  applied  in  a  wide  variety  of  settings  around  the  world. 

We  would  like  to  acknowledge  Douglas  Klaucke,  who  pulled  together  many  of  the  initial  thoughts  on 
organizing the book, and Stephen Thacker, the Director of the Epidemiology Program Office (EPO), and
Donna  Stroup,  Director  of  the  Division  of  Surveillance  and  Analysis,  for  their  continued  support  and 
encouragement.  We  also  acknowledge  with  gratitude  the  creative  guidance  and  constructive  criticism 
provided  by  EPO's  Assistant  Director  for  Science,  Edwin  Kilbourne.  Finally,  and  most  importantly  of 
all,  we  gratefully  recognize  the  expertise,  the  dedication,  and  the  commitment  of  all  the  authors  in 
assuring  that  this  book  became  a  reality. 

SMT  Atlanta,  Georgia 

REC  August  1992 


Chapter  I 


Introduction 


Stephen  B.  Thacker 

"If  you  don't  know  where  you're  going,  any  road  will  get  you  there." 

Lewis  Carroll 

Public  health  surveillance  is  the  ongoing  systematic  collection,  analysis,  and 
interpretation  of  outcome-specific  data  for  use  in  the  planning,  implementation,  and 
evaluation of public health practice (2). A surveillance system includes the
functional  capacity  for  data  collection  and  analysis,  as  well  as  the  timely 
dissemination  of  these  data  to  persons  who  can  undertake  effective  prevention  and 
control  activities.   While  the  core  of  any  surveillance  system  is  the  collection, 
analysis, and dissemination of data, the process can be understood only in the context
of  specific  health  outcomes. 

BACKGROUND 

The  idea  of  observing,  recording,  and  collecting  facts,  analyzing  them  and  considering 
reasonable courses of action stems from Hippocrates (2). The first real public health
action that can be related to surveillance probably occurred during the period of
bubonic plague, when public health authorities boarded ships in the port near the
Republic of Venice to prevent persons ill with plague-like illness from disembarking
(3). Before a large-scale organized system of surveillance could be developed,
however,  certain  prerequisites  needed  to  be  fulfilled.   First,  there  had  to  be  some 
semblance  of  an  organized  health-care  system  in  a  stable  government;  in  the  Western 
world,  this  was  not  achieved  until  the  time  of  the  Roman  Empire.   Second,  a 
classification  system  for  disease  and  illness  had  to  be  established  and  accepted, 
which only began to be functional in the 17th century with the work of Sydenham.
Finally,  adequate  measurement  methods  were  not  developed  until  that  time. 

Current  concepts  of  public  health  surveillance  evolve  from  public  health  activities 
developed  to  control  and  prevent  disease  in  the  community.   In  the  late  Middle  Ages, 
governments  in  Western  Europe  assumed  responsibilities  for  both  health  protection  and 
health care of the population of their towns and cities (4). A rudimentary system of
monitoring  illness  led  to  regulations  against  polluting  streets  and  public  water, 
construction  for  burial  and  food  handling,  and  the  provision  of  some  types  of  care 
(5).   In  1766,  Johann  Peter  Frank  advocated  a  more  comprehensive  form  of  public  health 
surveillance  with  the  system  of  police  medicine  in  Germany.   It  covered  school  health, 
injury  prevention,  maternal  and  child  health,  and  public  water  and  sewage  (4).      In 
addition,  he  delineated  governmental  measures  to  protect  the  public's  health. 

The  roots  of  analysis  of  surveillance  data  can  also  be  traced  to  the  17th  century.   In 
the  1680s,  von  Leibnitz  called  for  the  establishment  of  a  health  council  and  the 
application of a numerical analysis in mortality statistics to health planning (2).
About  the  same  time  in  London,  John  Graunt  published  a  book,  Natural  and  Political 
Observations  Made  Upon   the  Bills  of  Mortality,    in  which  he  attempted  to  define  the 
basic  laws  of  natality  and  mortality.   In  his  work,  Graunt  developed  some  fundamental 
principles  of  public  health  surveillance,  including  disease-specific  death  counts, 
death  rates,  and  the  concept  of  disease  patterns.   In  the  next  century,  Achenwall 
introduced the term "statistics," and over the next several decades vital statistics
became  more  widespread  in  Europe.   Nearly  a  century  later,  in  1845,  Thurnam  published 
the  first  extensive  report  of  mental  health  statistics  in  London. 

Two  prominent  names  in  the  development  of  the  concepts  of  public  health  surveillance 
activities  are  Lemuel  Shattuck  and  William  Farr.  Shattuck's  1850  report  of  the 
Massachusetts  Sanitary  Commission  was  a  landmark  publication  that  related  death, 
infant  and  maternal  mortality,  and  communicable  diseases  to  living  conditions. 
Shattuck  recommended  a  decennial  census,  standardization  of  nomenclature  of  causes  of 
disease  and  death,  and  a  collection  of  health  data  by  age,  gender,  occupation, 
socioeconomic  level,  and  locality.  He  applied  these  concepts  to  program  activities  in 
immunization,  school  health,  smoking,  and  alcohol  abuse,  and  introduced  these  concepts 
into  the  teaching  of  preventive  medicine. 


William  Farr  (1807-1883)  is  recognized  as  one  of  the  founders  of  modern  concepts  of 
surveillance  (6).      As  superintendent  of  the  statistical  department  of  the  Registrar 
General's  office  of  England  and  Wales  from  1839  to  1879,  Farr  concentrated  his  efforts 
on  collecting  vital  statistics,  on  assembling  and  evaluating  those  data,  and  on 
reporting  both  to  responsible  health  authorities  and  to  the  general  public. 

In  the  United  States,  public  health  surveillance  has  focused  historically  on 
infectious  disease.  Basic  elements  of  surveillance  were  found  in  Rhode  Island  in 
1741,  when  the  colony  passed  an  act  requiring  tavern  keepers  to  report  contagious 
disease  among  their  patrons.  Two  years  later,  the  colony  passed  a  broader  law 
requiring  the  reporting  of  smallpox,  yellow  fever,  and  cholera  (7). 

National  disease  monitoring  activities  did  not  begin  in  the  United  States  until  1850 
when  mortality  statistics  based  on  death  registration  and  the  decennial  census  were 
first  published  by  the  Federal  Government  for  the  entire  United  States  (8). 
Systematic  reporting  of  disease  in  the  United  States  began  in  1874  when  the 
Massachusetts  State  Board  of  Health  instituted  a  voluntary  plan  for  weekly  reporting 
by physicians on prevalent diseases, using a standard postcard-reporting
format  (9,10).      In  1878,  Congress  authorized  the  forerunner  of  the  Public  Health 
Service  (PHS)  to  collect  morbidity  data  for  use  in  quarantine  measures  against  such 
pestilential diseases as cholera, smallpox, plague, and yellow fever (11).

In  Europe,  compulsory  reporting  of  infectious  diseases  began  in  Italy  in  1881  and 
Great  Britain  in  1890.  In  1893,  Michigan  became  the  first  U.S.  jurisdiction  to 
require  the  reporting  of  specific  infectious  diseases.  Also  in  1893,  a  law  was 
enacted  to  provide  for  the  collection  of  information  each  week  from  state  and 
municipal  authorities  throughout  the  United  States  (12).     By  1901,  all  state  and 
municipal  laws  required  notification  (i.e.,  reporting)  to  local  authorities  of 
selected  communicable  diseases  such  as  smallpox,  tuberculosis,  and  cholera.   In  1914, 
PHS  personnel  were  appointed  as  collaborating  epidemiologists  to  serve  in  state  health 
departments  to  telegraph  weekly  disease  reports  to  the  PHS. 

In  the  United  States,  it  was  not  until  1925,  however,  following  markedly  increased 
reporting  associated  with  the  severe  poliomyelitis  epidemic  in  1916  and  the  influenza 
pandemic in 1918-1919, that all states had begun participating in national morbidity
reporting (13). A national health survey of U.S. citizens was first conducted in
1935.   After  a  1948  PHS  study  led  to  the  revision  of  morbidity  reporting  procedures, 
the  National  Office  of  Vital  Statistics  assumed  the  responsibility  for  morbidity 
reporting.  In  1949,  weekly  statistics  that  had  appeared  for  several  years  in  Public 
Health  Reports   began  being  published  by  the  National  Office  of  Vital  Statistics.   In 
1952,  mortality  data  were  added  to  the  publication  that  was  the  forerunner  of  the 
Morbidity and Mortality Weekly Report (MMWR). As of 1961, the responsibility for this
publication and its content was transferred to the Communicable Disease Center (now,
Centers for Disease Control [CDC]).

In  the  United  States,  the  authority  to  require  notification  of  cases  of  disease 
resides  in  the  respective  state  legislatures.   In  some  states,  authority  is  enumerated 
in  statutory  provisions;  in  other  states,  authority  to  require  reporting  has  been 
given  to  state  boards  of  health;  still  other  states  require  reports  both  under 
statutes and health department regulations. Variation among states also exists in the
conditions and diseases to be reported, time frames for reporting, agencies to receive
reports, persons required to report, and conditions under which reports are required
(14).

The  Conference  (now  Council)  of  State  and  Territorial  Epidemiologists  (CSTE)  was 
authorized  in  1951  by  its  parent  body,  the  Association  of  State  and  Territorial  Health 
Officials, to determine what diseases should be reported by states to the Public Health
Service and to develop reporting procedures. CSTE meets annually and, in
collaboration with CDC, recommends to its constituent members appropriate changes in
morbidity  reporting  and  surveillance,  including  what  diseases  should  be  reported  to 
CDC  and  published  in  the  MMWR. 

DEVELOPMENT  OF  THE  CONCEPT  OF  SURVEILLANCE 

Until  1950,  the  term  "surveillance"  was  restricted  in  public  health  practice  to 
monitoring  contacts  of  persons  with  serious  communicable  diseases  such  as  smallpox,  to 
detect early symptoms so that prompt isolation could be instituted (15). The critical
demonstration  in  the  United  States  of  the  importance  of  a  broader,  population-based 
view  of  surveillance  was  made  following  the  Francis  Field  Trial  of  poliomyelitis 
vaccine  in  1955  (16,17).   Within  2  weeks  of  the  announcement  of  the  results  of  the 
field trial and initiation of a nationwide vaccination program, six cases of paralytic
poliomyelitis were reported through the notifiable-disease reporting system to state
and local health departments; this surveillance led to an epidemiologic
investigation,  which  revealed  that  these  children  had  received  vaccine  produced  by  a 
single  manufacturer.   Intensive  surveillance  and  appropriate  epidemiologic 
investigations  by  federal,  state,  and  local  health  departments  found  141  vaccine- 
associated  cases  of  paralytic  disease,  80  of  which  represented  family  contacts  of 
vaccinees.   Daily  surveillance  reports  were  distributed  by  CDC  to  all  persons  involved 
in  these  investigations.  This  national  common-source  epidemic  was  ultimately  related 
to  a  particular  brand  of  vaccine  that  had  been  contaminated  with  live  poliovirus.   The 
Surgeon  General  requested  that  the  manufacturer  recall  all  outstanding  lots  of  vaccine 
and  directed  that  a  national  poliomyelitis  program  be  established  at  CDC.   Had  the 
surveillance  program  not  been  in  existence,  many  and  perhaps  all  vaccine  manufacturers 
would  have  ceased  production. 

In  1963,  Langmuir  limited  use  of  the  term  "surveillance"  to  the  collection,  analysis, 
and dissemination of data (18). This construct did not encompass direct
responsibility  for  control  activities.   In  1965,  the  Director  General  of  the  World 
Health  Organization  (WHO)  established  the  epidemiological  surveillance  unit  in  the 
Division of Communicable Diseases of WHO (19). The Division Director, Karel Raska,
defined  surveillance  much  more  broadly  than  Langmuir,  including  "the  epidemiological 
study  of  disease  as  a  dynamic  process."   In  the  case  of  malaria,  he  saw  epidemiologic 
surveillance  as  encompassing  control  and  prevention  activities.   Indeed,  the  WHO 
definition  of  malaria  surveillance  included  not  only  case  detection,  but  also 
obtaining  blood  films,  drug  treatment,  epidemiologic  investigation,  and  follow-up 
(20).

In  1968,  the  21st  World  Health  Assembly  focused  on  national  and  global  surveillance  of 
communicable  diseases,  applying  the  term  to  the  diseases  themselves  rather  than  to  the 
monitoring of individuals with communicable disease (21). Following an invitation
from  the  Director  General  of  WHO  and  with  consultation  from  Raska,  Langmuir  developed 
a  working  paper  and  in  the  year  prior  to  the  Assembly  obtained  comments  from 
throughout  the  world  on  the  concepts  and  practices  advocated  in  the  paper.  At  the 
Assembly,  with  delegates  from  over  100  countries,  the  working  paper  was  endorsed,  and 
discussions  on  the  national  and  global  surveillance  of  communicable  disease  identified 
three main features of surveillance that Langmuir had described in 1963: a) the
systematic collection of pertinent data, b) the orderly consolidation and evaluation
of these data, and c) the prompt dissemination of results to those who need to
know--particularly those in position to take action.

The  1968  World  Health  Assembly  discussions  reflected  the  broadened  concepts  of 
"epidemiologic surveillance" and addressed the application of the concept to public
health problems other than communicable disease (20). In addition, epidemiologic
surveillance was said to imply "...the responsibility of following up to see that
effective action has been taken."

Since that time, a wide variety of health events, such as childhood lead poisoning,
leukemia, congenital malformations, abortions, injuries, and behavioral risk factors
have been placed under surveillance. In 1976, recognition of the breadth of
surveillance activities throughout the world was made evident by the fact that a
special issue of the International Journal of Epidemiology was devoted to surveillance
(22).

SURVEILLANCE  IN  PUBLIC  HEALTH  PRACTICE 

The  primary  function  of  the  application  of  the  term  "epidemiologic"  to  surveillance, 
which  first  appeared  in  the  1960s  associated  with  the  new  WHO  unit  of  that  name,  was 
to  distinguish  this  activity  from  other  forms  of  surveillance  (e.g.,  military 
intelligence)  and  to  reflect  its  broader  applications.  The  use  of  the  term 
"epidemiologic,"  however,  engenders  both  confusion  and  controversy.   In  1971,  Langmuir 
noted  that  some  epidemiologists  tended  to  equate  surveillance  with  epidemiology  in  its 
broadest sense, including epidemiologic investigations and research (25). He found
this "both epidemiologically and administratively unwise," favoring a description of
surveillance  as  "epidemiological  intelligence." 

What  are  the  boundaries  of  surveillance  practice?  Is  "epidemiologic"  an  appropriate 
modifier  of  surveillance  in  the  context  of  public  health  practice?  To  address  these 
questions,  we  must  first  examine  the  structure  of  public  health  practice.   One  can 
divide  public  health  practice  into  surveillance;  epidemiologic,  behavioral,  and 
laboratory  research;  service  (including  program  evaluation);  and  training. 


Surveillance  data  should  be  used  to  identify  research  and  service  needs,  which,  in 
turn,  help  to  define  training  needs.   Unless  data  are  provided  to  those  who  set  policy 
and  implement  programs,  their  use  is  limited  to  archives  and  academic  pursuits,  and 
the  material  is  therefore  appropriately  considered  to  be  health  information  rather 
than  surveillance  data.   However,  surveillance  does  not  encompass  epidemiologic 
research  or  service,  which  are  related  but  independent  public  health  activities  that 
may  or  may  not  be  based  on  surveillance.   Thus,  the  boundary  of  surveillance  practice 
excludes  actual  research  and  implementation  of  delivery  programs. 

Because  of  this  separation,  "epidemiologic"  cannot  accurately  be  used  to  modify 
surveillance (1). The term "public health surveillance" describes the scope
(surveillance) and indicates the context in which it occurs (public health). It also
obviates the need to accompany any use of the term "epidemiologic surveillance" with a
list of all the examples this term does not cover. Surveillance is correctly--and
necessarily--a  component  of  public  health  practice,  and  should  continue  to  be 
recognized  as  such. 

PURPOSES  AND  USES  OF  PUBLIC  HEALTH  SURVEILLANCE  DATA 

Purposes 

Public  health  surveillance  data  are  used  to  assess  public  health  status,  define  public 
health  priorities,  evaluate  programs,  and  conduct  research.  Surveillance  data  tell 
the  health  officer  where  the  problems  are,  whom  they  affect,  and  where  programmatic 
and  prevention  activities  should  be  directed.  Such  data  can  also  be  used  to  help 
define public health priorities in a quantitative manner and to evaluate
the  effectiveness  of  programmatic  activities.  Results  of  analysis  of  public  health 
surveillance  data  also  enable  researchers  to  identify  areas  of  interest  for  further 
investigation (23).

The  analysis  of  surveillance  data  is,  in  principle,  quite  simple.   Data  are  examined 
by  measures  of  time,  place,  and  person.  The  routine  collection  of  information  about 
reported  cases  of  congenital  syphilis  in  the  United  States,  for  example,  reflects  not 
only  numbers  of  cases  (Figure  1.1),  geographic  distribution,  and  populations  affected, 
but  also  indicates  the  effects  of  crack  cocaine  use  and  changing  sexual  practices  over 
the  past  10  years.   The  examination  of  routinely  collected  data  show  where  rates  of 
salmonellosis by county in New Hampshire and in three contiguous states. Mapping
these  data  illustrates  the  pattern  of  the  spread  of  disease  across  state  boundaries 
(Figure  1.2).   The  examination  of  death  certificates  for  data  on  homicide  identifies 
high-risk  groups  and  shows  that  the  problem  has  reached  epidemic  proportions  among 
young  adult  men  (Figure  1.3). 
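
The time, place, and person tabulation described above can be sketched in a few lines of Python. This is a minimal illustration only: the records, the field order, and the helper name tabulate are hypothetical and are not drawn from any surveillance system discussed in this chapter.

```python
from collections import Counter

# Hypothetical case reports: (year of report, state, age group).
# These records are invented for illustration only.
cases = [
    (1989, "NH", "<1"),
    (1989, "VT", "<1"),
    (1990, "NH", "<1"),
    (1990, "NH", "1-4"),
    (1990, "ME", "<1"),
]


def tabulate(records, position):
    """Count records by the field at the given tuple position."""
    return Counter(record[position] for record in records)


by_time = tabulate(cases, 0)    # time: cases per year
by_place = tabulate(cases, 1)   # place: cases per state
by_person = tabulate(cases, 2)  # person: cases per age group

for label, table in (("time", by_time), ("place", by_place), ("person", by_person)):
    print(label, dict(sorted(table.items())))
```

In practice, counts such as these would usually be divided by population denominators to give rates before areas or groups are compared.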

Uses

The  uses  of  surveillance  are  shown  in  Table  1.1.   Portrayal  of  the  natural  history  of 
disease  can  be  illustrated  by  the  surveillance  of  malaria  rates  in  the  United  States 
since  1930  (Figure  1.4).   In  the  1940s,  malaria  was  still  an  endemic  health  problem  in 
the  southeastern  United  States  to  the  degree  that  persons  with  febrile  illness  were 
often  treated  for  malaria  until  further  tests  were  available.  After  the  Malaria 
Control  in  the  War  Areas  Program  led  to  the  virtual  elimination  of  endemic  malaria 
from  the  United  States,  rates  of  malaria  decreased  until  the  early  1950s,  when 
military  personnel  involved  in  the  conflict  in  Korea  returned  to  the  United  States 
with  malaria.  The  general  downward  trend  in  reported  cases  of  malaria  continued  into 
the  1960s  until,  once  again,  numbers  of  cases  of  malaria  rose,  this  time  among 
veterans  returning  from  the  war  in  Vietnam.   Since  that  time,  we  have  continued  to  see 
increases  in  numbers  of  reported  cases  of  malaria  involving  immigrant  populations,  as 
well  as  among  U.S.  citizens  traveling  abroad. 

Surveillance  data  can  be  used  also  to  detect  epidemics.   For  example,  during  the  swine 
influenza  immunization  program  in  1976,  a  surveillance  system  was  established  to 
detect adverse sequelae related to the program (24). Working with state and local
health departments, CDC was able to detect an epidemic of Guillain-Barré syndrome,
which  rapidly  led  to  the  termination  of  a  program  in  which  40,000,000  U.S.  citizens 
had  been  vaccinated.   However,  most  epidemics  are  not  detected  by  such  analysis  of 
routinely  collected  data  but  are  identified  through  the  astuteness  and  alertness  of 
clinicians  and  public  health  officials  of  the  community.   From  a  pragmatic  point  of 
view,  the  key  point  is  that  when  someone  does  note  an  unusual  occurrence  in  the  health 
picture  of  a  community,  the  existence  of  organized  surveillance  efforts  in  the  health 
department  provides  the  infrastructure  for  conveying  information  to  facilitate  a 
timely  and  appropriate  response. 

The distribution and spread of disease can be documented from surveillance data, as
seen  in  the  county-specific  data  on  salmonellosis  (Figure  1.2).   U.S.  cancer  mortality 
statistics  have  also  been  mapped  at  the  county  level  to  identify  a  variety  of 
geographic patterns that suggest hypotheses on etiology and risk (25). Recognition of
such clusters can lead to further epidemiologic or laboratory research, sometimes
using individuals identified in surveillance as subjects in epidemiologic studies.
The association between the periconceptional use of multivitamins by women and the
development of neural tube defects in their children was documented using children
identified in a surveillance system for congenital malformations (26).

Surveillance  data  can  also  be  used  to  test  hypotheses.   For  example,  in  1978  the  U.S. 
Public  Health  Service  announced  a  measles  elimination  program  that  included  an  active 
effort  to  vaccinate  school-age  children.  Because  of  this  program  and  the  state  laws 
that  excluded  from  school  students  who  had  not  been  vaccinated,  CDC  anticipated  a 
change  in  the  age  pattern  of  persons  reported  to  have  measles.  Before  the  initiation 
of  the  program,  the  highest  reported  rates  of  measles  were  for  children  10-14  years  of 
age.  As  predicted,  almost  immediately  after  the  school  exclusion  policy  was 
implemented,  there  was  not  only  a  general  decrease  in  the  number  of  cases  but  also  a 
shift  in  peak  occurrence  from  school-age  to  preschool-age  children  (Figure  1.5).  By 
1979,  there  were  even  lower  levels  of  measles  incidence  and  altered  age-specific 
patterns. 

Surveillance  data  can  be  applied  in  evaluating  control  and  prevention  measures.  With 
routinely collected data, one can examine--without special studies--the effect of a
health policy. For example, the introduction of inactivated poliovirus vaccine in the
United States in the 1950s was followed by a dramatic decrease in the number of
reported cases of paralytic poliomyelitis, and the subsequent introduction
in the 1960s of oral poliovirus vaccine was followed by an even greater decline
(Figure 1.6).

Efforts  to  monitor  changes  in  infectious  agents  have  been  facilitated  by  the  use  of 
surveillance  data.   In  the  late  1970s,  antibiotic-resistant  gonorrhea  was  introduced 
into the United States from Asia. Laboratory- and clinical-practice-based
surveillance  for  cases  of  gonorrhea  enabled  public  health  officials  to  monitor  the 
rapid  diffusion  of  various  strains  of  this  bacterium  nationally  and  facilitated 
prevention activities, including notifying clinicians of proper treatment procedures
(Figure 1.7). Similarly, the National Nosocomial Infections Surveillance System, a
voluntary, hospital-based surveillance system of hospital-acquired infections, has
been  used  to  monitor  changes  in  antibiotic-resistance  patterns  of  infectious  agents 
associated  with  hospitalized  patients. 

As noted earlier, the first use of surveillance was to monitor persons with a view to
imposing quarantine as necessary. Although this use of surveillance is rare in the
modern-day United States, in 1975--with the introduction of a suspected case of Lassa
fever--over 500 potential contacts of the patient were monitored daily for 2 weeks to
assure that secondary spread of this serious infectious agent did not occur (27).

Surveillance  data  can  also  be  used  to  good  effect  for  detecting  changes  in  health 
practice.  The  increasing  use  of  various  technologies  in  health  care  has  come  to  be  an 
issue  of  growing  concern  over  the  past  decade;  surveillance  data  can  provide  useful 
information in this area (28). For example, in the United States since 1965, the rate
of cesarean delivery has increased from <5% to nearly 25% of all
deliveries  (Figure  1.8).   Data  such  as  these  are  useful  both  in  planning  research  to 
learn  the  causes  of  these  changes  and  in  monitoring  the  impact  of  such  changes  in 
practice  and  procedure  on  outcomes  and  costs  associated  with  health  care. 

Finally,  surveillance  data  are  useful  for  planning.  With  knowledge  about  changes  in 
the  population  structure  or  in  the  nature  of  conditions  that  might  affect  a 
population,  officials  can,  with  more  confidence,  plan  for  optimizing  available 
resources.   For  example,  data  on  refugees  entering  the  United  States  from  Southeast 
Asia  in  the  early  1980s  were  broadly  applicable;  they  told  where  people  settled, 
described  the  age  and  gender  structure  of  the  population,  and  identified  health 
problems  that  might  be  expected  in  that  population.  With  this  information,  health 
officials  were  able  to  plan  more  effectively  the  appropriate  health  services  and 
preventive  activities  for  this  new  population. 

THE  FUTURE  OF  PUBLIC  HEALTH  SURVEILLANCE 

As  we  approach  the  year  2000,  several  activities  are  expected  to  contribute  to  the 
evolution of public health surveillance. First, use of the computer--particularly the
microcomputer--has revolutionized the practice of public health surveillance. In the
United  States,  the  National  Electronic  Telecommunications  System  for  Surveillance 
(NETSS)  links  all  state  health  departments  by  computer  for  the  routine  collection, 
analysis, and dissemination of data on notifiable health conditions (29). Over the
next  several  years,  the  growth  will  be  within  states,  with  state  health  departments 
being  linked  to  county  departments,  and  possibly  even  to  health-care  providers' 
offices  for  routine  surveillance.   The  Minitel  system  currently  in  use  in  France  has 
already  demonstrated  the  essential  utility  of  office-based  surveillance  of  various 
conditions of public health importance (30).

The  second  area  of  renewed  activity  associated  with  surveillance  is  that  of 
epidemiologic  and  statistical  analysis.   A  by-product  of  the  use  of  computers  is  the 
ability  to  make  more  effective  use  of  sophisticated  tools  to  detect  changes  in 
patterns  of  occurrence  of  health  problems.   In  the  1980s,  applications  and  methods  of 
time  series  analysis  and  other  techniques  have  enabled  us  to  provide  more  meaningful 
interpretation of data collected in surveillance efforts (31). More sophisticated
techniques  will  doubtless  continue  to  be  applied  in  the  area  of  public  health  as  they 
are  developed. 

Until  recently,  surveillance  data  were  traditionally  disseminated  as  written  documents 
published  periodically  by  government  agencies.   While  paper  reports  will  continue  to 
be  produced,  and  public  health  officials  will  continue  to  refine  the  use  of  print 
media,  they  are  also  beginning  to  use  electronic  media  for  the  dissemination  of 
surveillance  data.   More  effective  use  of  the  electronic  media,  and  all  the  other 
tools  of  communications,  should  facilitate  the  use  of  surveillance  data  for  public 
health  practice.  At  the  same  time,  ready  access  to  detailed  information  on 
individuals  will  continue  to  provide  ethical  and  legal  concerns  that  may  constrain 
access  to  data  of  potential  public  health  importance. 

The  1990s  will  see  surveillance  concepts  applied  to  new  areas  of  public  health 
practice  such  as  chronic  disease,  environmental  and  occupational  health,  and  injury 
control.  The  evolution  and  development  of  methods  for  these  programmatic  areas  will 
continue  to  be  a  major  challenge  in  public  health. 

A more fundamental principle that will underlie the ongoing development of
surveillance is the increasing ability of people to look at public health surveillance
as a scientific endeavor (32). A growing appreciation of the need for rigor in
surveillance  practice  will  no  doubt  improve  the  quality  of  surveillance  programs  and 
will  therefore  facilitate  the  analysis  and  use  of  surveillance  data.  An  important 
result of this more rigorous approach to surveillance practice will be the increased
frequency and quality of the evaluation of the practice of surveillance (33).

Finally,  and  probably  most  important,  is  the  observation  that  surveillance  needs  to  be 
used  more  consistently  and  thoughtfully  by  policymakers.   Epidemiologists  not  only 
need  to  improve  the  quality  of  their  analysis,  interpretation,  and  display  of  data  for 
public  health  use,  they  also  need  to  listen  to  persons  empowered  to  set  policy  in 
order  to  understand  what  stimulates  the  policymakers'  interest  and  action.  This 
assessment  allows  surveillance  information  to  be  crafted  so  that  it  is  presented  in 
its  most  useful  form  to  the  appropriate  audience  and  in  the  necessary  time  frame.   In 
turn,  as  we  maximize  the  utility  of  data  for  decision  making  and  better  understand 
what  is  essential  to  that  process,  we  will  raise  the  area  of  public  health 
surveillance  to  a  new  and  higher  level  of  importance. 

The  critical  challenge  in  public  health  surveillance  today,  however,  continues  to  be 
the  assurance  of  its  usefulness.   In  this  effort,  we  must  have  rigorous  evaluation  of 
public  health  surveillance  systems.   Even  more  basic  is  the  need  to  regard 
surveillance  as  a  scientific  endeavor.   To  do  this  properly,  one  must  fully  understand 
the  principles  of  surveillance  and  its  role  in  guiding  epidemiologic  research  and 
influencing  other  aspects  of  the  overall  mission  of  public  health.   Epidemiologic 
methods  based  on  public  health  surveillance  must  be  developed;  computer  technology  for 
efficient  data  collection,  analysis,  and  graphic  display  must  be  applied;  ethical  and 
legal  concerns  must  be  addressed  effectively;  the  use  of  surveillance  systems  must  be 
reassessed  on  a  routine  basis;  and  surveillance  principles  must  be  applied  to  emerging 
areas  of  public  health  practice. 

Chapter II


Planning  a  Surveillance   System 


Steven  Teutsch 


"Natural  laws  govern  the  occurrence  of  a  disease,  that  these  laws  can  be  discovered  by 
epidemiologic  inquiry  and  that,  when  discovered,  the  causes  of  epidemics  admit  to  a 
great  extent  of  remedy." 

William  Farr 

As  described  earlier,  public  health  surveillance  is  the  systematic  and  ongoing 
assessment  of  the  health  of  a  community,  including  the  timely  collection,  analysis, 
interpretation,  dissemination,  and  subsequent  use  of  data.   Surveillance  provides 
information  for  action,  information  with  a  purpose.   Surveillance  systems  evolve  in 
response  to  ever-changing  needs  of  society  in  general  and  of  the  public  health 
community  in  particular.   In  order  to  understand  and  meet  those  needs,  an  organized 
approach  to  planning,  developing,  implementing,  and  maintaining  surveillance  systems 
is imperative. The sections below outline approaches to the planning and evaluation
processes, which are presented in more detail elsewhere in this book. The
steps in planning a system are shown in Table II.1.

OBJECTIVES OF A SURVEILLANCE SYSTEM


Planning  a  surveillance  system  begins  with  a  clear  understanding  of  the  purpose  of 
surveillance,  i.e.,  the  answer  to  the  question:   "What  do  you  want  to  know?"   In  the 
context  of  public  health,  surveillance  may  be  established  to  meet  a  variety  of 
objectives,  including  assessment  of  public  health  status,  establishment  of  public 
health  priorities,  evaluation  of  programs,  and  conduct  of  research.   Surveillance  data 
can  be  used  in  all  of  the  following  ways: 

to  estimate  the  magnitude  of  a  health  problem  in  the 
population  at  risk 

to  understand  the  natural  history  of  a  disease  or  injury 

to  detect  outbreaks  or  epidemics 

to  document  the  distribution  and  spread  of  a  health  event 

to  test  hypotheses  about  etiology 

to  evaluate  control  strategies 

to  monitor  changes  in  infectious  agents 

to  monitor  isolation  activities 

to  detect  changes  in  health  practice 

to  identify  research  needs  and  facilitate  epidemiologic 
and  laboratory  research 

to  facilitate  planning 


Surveillance  is  inherently  outcome  oriented  and  focused  on  various  outcomes  associated 
with  health-related  events  or  their  immediate  antecedents.   These  include  the 
frequency  of  an  illness  or  injury,  usually  measured  in  terms  of  numbers  of  cases, 
incidence,  or  prevalence;  the  severity  of  the  condition,  measured  as  a  case-fatality 
ratio,  hospitalization  rate,  mortality  rate,  or  disability;  and  the  impact  of  the 
condition,  measured  in  terms  of  cost.   Where  risk  factors  or  specific  procedures  are 
incontrovertibly linked to health outcomes, it is often useful to measure the former
because they are often more frequent (and hence more precisely ascertainable
for small populations) and may be more closely linked to public health interventions.
For example, mammography with suitable follow-up is the major prevention strategy for
reducing mortality associated with breast cancer. The level of
utilization of mammography by women can be regularly monitored and should be a more
timely indicator of the impact of public health prevention programs than measurement
of mortality from breast cancer. Surveillance data should also provide basic
information  on  the  utilization  of  mammography  services  by  age  and  race/ethnicity  of 
recipient,  allowing  better  targeting  of  prevention  efforts  on  the  population  sectors 
with  the  lowest  utilization.   In  addition,  over-utilization  by  some  parts  of  the 
population  (e.g.,  women  <35  years  of  age  who  do  not  have  other  risk  factors)  might 
stimulate  efforts  to  reduce  unnecessary  procedures. 
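
The frequency and severity measures named above can be illustrated with a minimal
sketch. The function names and all of the numbers are hypothetical and chosen only
to show the arithmetic; they are not drawn from any surveillance system described
in this book.

```python
def incidence_rate(new_cases, population_at_risk, per=100_000):
    """New cases arising during a period, per `per` persons at risk."""
    return new_cases * per / population_at_risk


def prevalence(existing_cases, population, per=100_000):
    """Existing cases at a point in time, per `per` persons."""
    return existing_cases * per / population


def case_fatality_ratio(deaths, cases):
    """Proportion of identified cases that died of the condition."""
    return deaths / cases


# Hypothetical community of 250,000 persons observed for one year.
population = 250_000
new_cases = 50        # cases with onset during the year
existing_cases = 400  # persons living with the condition at year end
deaths = 5            # deaths among the 50 new cases

print(incidence_rate(new_cases, population))   # cases per 100,000 per year
print(prevalence(existing_cases, population))  # cases per 100,000
print(case_fatality_ratio(deaths, new_cases))  # proportion of cases that died
```

Expressing counts per 100,000 persons is only one common convention; a system
covering a small population might reasonably report per 1,000 or per 10,000.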

High-priority  health  events  should  clearly  be  under  surveillance.   However, 
determining  which  should  be  considered  high-priority  events  can  be  a  daunting  task. 
Both  quantitative  and  qualitative  approaches  can  be  used  in  a  selection  process.  Some 
quantitative factors are shown in Table II.2. In addition, criteria based on a
consensus  process  to  identify  high-priority  problems  may  identify  emerging  issues  or 
problems  that  might  otherwise  not  be  considered.  The  consensus  process  leading  to  the 
Year  2000  Health  Promotion  and  Disease  Prevention  Objectives  in  the  United  States  is 
an  example  of  a  mechanism  for  identifying  high-priority  conditions,  types  of  behavior, 
and interventions that require ongoing monitoring (2).

Because  public  health  surveillance  in  the  United  States  is  driven  by  the  public  health 
need  to  be  cognizant  of  diseases  and  injuries  in  the  community  and  to  respond 
appropriately,  surveillance  is  inherently  an  applied  science.  Therefore,  as 
surveillance  has  evolved,  it  is  generally  undertaken  only  when  there  is  reasonable 
expectation  that  control  measures  will  be  taken  as  appropriate.   For  many  conditions 
the  link  between  surveillance  and  action  is  obvious  (e.g.,  meningococcal  meningitis 
prophylaxis for contacts of patients diagnosed as having meningitis). For emerging
conditions, such as eosinophilia-myalgia syndrome, there is a compelling public health
need to identify cases (delineate the magnitude of the problem), identify the mode of
spread,  and  take  appropriate  action. 

Surveillance  data  are  usually  augmented  by  additional  studies  to  determine  more 
precisely  the  causes,  natural  history,  predisposing  factors,  and  modes  of  transmission 
associated  with  the  health  problem.   Yet,  undertaking  surveillance  exclusively  for 
research  purposes  is  rarely  warranted.  Research  needs  are  often  better  served  by 
other,  more  precise  (and  often  more  costly)  methods  of  case  identification  (e.g., 
registries), which facilitate more detailed data collection and tracking of cases.
For  example,  registries  of  type  I  diabetes  may  have  value  for  surveillance,  but  are 
justified  primarily  because  they  fill  research  needs.   The  ongoing  public  health 
application  of  these  data  is  more  limited.   Scarce  public  health  resources  and  the 
efforts  of  health-care  providers  to  report  cases  need  to  be  focused  on  problems  for 
which  the  public  health  importance  and  the  need  for  public  health  action  can  be 
readily  recognized. 

A  primary  role  of  surveillance  is  the  assessment  of  the  overall  health  status  of  a 
community.   One  approach  to  this  issue  is  the  development  and  identification  of  a  set 
of  indicators  that  measure  major  components  of  health  status.   Such  a  set  has  been 
developed  in  the  United  States  to  be  used  at  a  national,  state,  and  local  level 
(2). Another approach is to examine the most frequent, severe, costly, and
preventable conditions in the community by examining the most frequent causes of death,
hospitalization,  injury,  disability,  infection,  work-site-associated  illness  and 
injury,  and  major  risk  factors  for  all  the  preceding  items.   This  information  can  be 
obtained  in  most  communities  in  terms  of  age,  race/ethnicity,  gender,  and  temporal 
trends.   Regular  assessments  of  the  information  can  form  the  basis  for  educating  the 
community  about  its  major  health  problems  and  for  identifying  specific  conditions  that 
merit  more  intensive  surveillance  and  intervention. 

The  specific  objective  and  purpose  of  the  surveillance  system  should  be  specified  and 
general  agreement  obtained. 

METHODS 

Once the purpose of and need for a surveillance system have been identified, methods
for obtaining, analyzing, disseminating, and using the information should be
determined and implemented (see Chapters V, VI, and VII).

Because surveillance systems are ongoing and require the cooperation of many
individuals, careful consideration must be given to the attributes described in the
discussion of evaluation in Chapter VIII. The system adopted must be feasible and
acceptable to those who will contribute to its success; it must be sensitive enough to
provide the information required to do the job at hand, while having a high
predictive-value positive to minimize the expenditure of resources on following up
false-positive cases. A surveillance system should be flexible enough to meet the
continually evolving needs of the community and to accommodate changes in patterns of
disease and injury. It must provide information that is timely enough to be acted
upon. All of these considerations must be carefully balanced in order to design a
system that can successfully meet identified needs without becoming excessively costly
or burdensome.

Case  Definitions 

Practical epidemiology is heavily dependent on clear case definitions that include
criteria for person, place, and time and that may be categorized by the
degree of certainty regarding diagnosis as "suspected" or "confirmed" cases (3).

While  high  sensitivity  and  specificity  are  both  desirable,  generally  one  comes  at  the 
expense of the other. A balance must be struck between the desire for high
sensitivity and the level of effort required to track down false-positive cases. In
addition,  case  definitions  evolve  over  time.   During  periods  of  outbreaks,  cases 
epidemiologically  linked  to  the  outbreak  cases  may  be  accepted  as  cases,  whereas  in 
non-epidemic  periods,  serologic  or  other  more  specific  information  may  be  required. 
Similarly,  when  active  surveillance  is  used,  such  as  in  measles  control  programs, 
numbers  of  cases  identified  tend  to  rise. 
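
The trade-off between sensitivity and the effort spent on false-positive reports
can be made concrete with a small sketch. The two case definitions and their
counts below are hypothetical, invented only to show how the measures move in
opposite directions when a definition is tightened.

```python
def sensitivity(tp, fn):
    """Proportion of true cases that the case definition detects."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of non-cases that the case definition correctly excludes."""
    return tn / (tn + fp)


def predictive_value_positive(tp, fp):
    """Proportion of reported cases that are true cases."""
    return tp / (tp + fp)


# Hypothetical counts for the same 1,000 persons, of whom 100 truly have the
# condition, classified under a broad and a narrow case definition.
definitions = {
    "broad":  {"tp": 90, "fn": 10, "fp": 60, "tn": 840},
    "narrow": {"tp": 60, "fn": 40, "fp": 5,  "tn": 895},
}

for name, c in definitions.items():
    print(
        name,
        round(sensitivity(c["tp"], c["fn"]), 2),
        round(specificity(c["tn"], c["fp"]), 2),
        round(predictive_value_positive(c["tp"], c["fp"]), 2),
    )
```

In this invented example the broad definition finds more of the true cases but
generates twelve times as many false-positive reports to follow up.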

As  our  understanding  of  a  disease  and  its  associated  laboratory  testing  improves, 
alterations  in  case  definitions  often  lead  to  changes  in  sensitivity  and  specificity. 
As  new  systems  complement  old  ones  (e.g.,  as  a  morbidity  system  supplements  a 
mortality system for injury surveillance), the reported frequency and patterns of
conditions  change.  These  changes  must  be  taken  into  account  in  analysis  and 
interpretation  of  secular  trends  in  the  frequency  of  reporting.   It  is  all  too  easy  to 
define  cases  of  various  conditions  with  such  different  criteria  that  it  is  difficult 
to compare the essential descriptors of person, place, or time. For example, in
surveillance of diabetes, one could determine the frequency of diabetes from surveys
(self reports of diabetes), surveys using glucose determination
(laboratory-confirmed), or reviews of ambulatory or hospital records (physician diagnoses).
Each method provides a different perspective on the problem. Self reports are subject
to vagaries of recall and variation in interpretation (a patient may be under treatment,
may have "a touch of diabetes" or prediabetes, or may have a history of gestational
diabetes). Glucose determinations allow detection of previously undiagnosed diabetes.
Medical records identify only patients currently receiving medical care.

Case  definitions  should  be  specified  including  criteria  for  person,  place,  time, 
clinical  or  laboratory  diagnosis,  and  epidemiologic  features. 
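
A case definition of this kind can be written down as an explicit rule, which
helps ensure that it is applied the same way to every report. The sketch below
is illustrative only: the field names, the county list, and the classification
rules are hypothetical and do not reproduce any published case definition.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Report:
    onset: date                  # time
    county: str                  # place
    clinically_compatible: bool  # clinical criteria met
    lab_confirmed: bool          # laboratory criteria met
    epi_linked: bool             # epidemiologically linked to a known case


def classify(report, counties, start, end, outbreak=False):
    """Return 'confirmed', 'suspected', or 'not a case' (hypothetical rules)."""
    in_scope = report.county in counties and start <= report.onset <= end
    if not in_scope or not report.clinically_compatible:
        return "not a case"
    # During an outbreak, an epidemiologic link is accepted in place of
    # laboratory confirmation; otherwise laboratory evidence is required.
    if report.lab_confirmed or (outbreak and report.epi_linked):
        return "confirmed"
    return "suspected"


r = Report(date(1992, 3, 4), "Adams", True, False, True)
window = (date(1992, 1, 1), date(1992, 12, 31))
print(classify(r, {"Adams"}, *window))                 # non-epidemic period
print(classify(r, {"Adams"}, *window, outbreak=True))  # outbreak period
```

The same report is classified differently in the two calls, which is the reason
changes in a case definition must be taken into account when trends are read.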

Data  Collection 

Information  on  diseases,  injuries,  and  risk  factors  can  be  obtained  in  many  ways. 
Each  mechanism  has  characteristics  that  must  be  balanced  against  the  purpose  of  the 
system (see Chapter III). Timeliness is of the essence for frequently fatal
conditions such as plague, rabies, or meningococcal meningitis. Notifiable-disease
systems are most appropriate for such potentially catastrophic conditions with high
and urgent preventability constraints. Conversely, detailed information on influenza
strains or Salmonella serotypes must come from laboratory-based systems. Long-term
mortality patterns are available through vital records systems.

Often,  existing  data  sets  can  provide  surveillance  data.   Such  sets  include  vital 
records, administrative systems, and risk-factor or health-interview surveys. Among
administrative systems, hospital-discharge data, medical-management-information and
billing systems, police records for violence, and school records for disabilities or
injuries among children can all provide needed data. In addition, with some
modification, an existing system might provide needed data more economically or
efficiently than a newly initiated system.

Existing  registries  or  surveys  may  collect  information  on  defined  populations.   To  the 
extent  that  the  condition  of  interest  is  uniformly  distributed,  the  population  under 
study  is  reasonably  representative,  and  the  information  collected  is  available  on  a 
timely  basis,  such  systems  can  be  valuable  data  sources.   Although  many  registries  are 
established  for  research  purposes,  they  often  provide  valuable  data  for  surveillance 
purposes.   In  particular,  cancer  registries  have  been  widely  used  (4) . 

Sentinel providers can also constitute a network for collecting data on common
conditions,  such  as  influenza;  more  specialized  providers  can  provide  data  on  less 
common  conditions,  e.g.,  ophthalmologists  who  provide  information  on  treatment  of 
patients  for  diabetic  retinopathy. 

Standardization 

Data-collection  instruments  should  use  generally  recognized  and,  where  suitable, 
computerized  formats  for  each  data  element  to  facilitate  analysis  and  comparison  with 
data  collected  in  other  systems,  e.g.,  census  and  other  surveillance  data.  Careful 
consideration  should  be  given  to  using  identifiers.  Although  additional  assurances  of 
confidentiality  and  privacy  considerations  will  be  required,  the  ability  to  link  data 
to  other  systems,  such  as  through  the  National  Death  Index,  may  enhance  the  value  of 
the  system. 

Active  and  passive  systems 

Primary  surveillance-data-collection  systems  have  traditionally  been  classified  as 
passive  or  active.   For  example,  most  routine  notifiable-disease  surveillance  relies 
on  passive  reporting.   On  the  basis  of  a  published  list  of  conditions,  health-care 
providers  report  notifiable  diseases  on  a  case-by-case  basis  to  the  local  health 
department.  This  passive  system  has  the  advantage  of  being  simple  and  not  burdensome 
to  the  health  department,  but  it  is  limited  by  variability  and  incompleteness  in 
reporting.   Although  the  completeness  of  reporting  may  be  augmented  by  efforts  to 
publicize  the  importance  of  reporting  and  by  continued  feedback  to  communications 
media  representatives,  passive  reporting  systems  may  still  not  be  representative  and 
they  may  fail  to  identify  outbreaks.   To  obviate  these  problems,  more  active  systems 
are  often  used  for  conditions  of  particular  importance.  These  systems  involve  regular 
outreach  to  potential  reporters  to  stimulate  the  reporting  of  specific  diseases  or 
injuries.  Active  systems  can  validate  the  representativeness  of  passive  reports, 
assure  more  complete  reporting  of  conditions,  or  be  used  in  conjunction  with  specific 
epidemiologic  investigations.   Since  resources  are  often  limited,  active  systems  are 
often  used  for  brief  periods  for  discrete  purposes  such  as  during  the  measles 
elimination  efforts. 

Limited  surveillance  systems 

Some surveillance efforts may not require ongoing systems. Time-limited surveillance
may be needed to address specific problems for which all cases must be
identified in order to assess the level of risk. Such programs can be conducted to
resolve specific problems and then be terminated (5). Similarly, for logistic and
economic  reasons,  it  may  not  be  feasible  to  mount  a  surveillance  system  across  large 
geographic  areas,  and  representative  populations  may  need  to  be  selected.   Sentinel 
providers  can  also  provide  information  on  common  conditions  or  conditions  of 
particular  interest  to  them. 

Field  testing 

The  careful  development  and  field  testing  of  surveillance  systems  and  procedures  is 
important  to  facilitate  the  implementation  of  feasible  systems  and  to  avoid  making 
changes  as  systems  are  implemented  on  a  broad  scale.   The  frustration  engendered  by  a 
new  and  poorly  executed  system  may  undermine  efforts  to  improve  or  use  existing 
systems  for  the  same  or  other  conditions.  As  new  surveillance  systems  or  new 
instruments  and  procedures  are  developed,  field  tests  of  their  feasibility  and 
acceptability  are  appropriate.   These  field-test  projects  can  demonstrate  how  readily 
the  information  can  be  obtained  and  can  detect  difficulties  in  data-collection 
procedures  or  in  the  content  of  specific  questions.  Analyses  of  this  test  information 
may  also  identify  problems  with  the  information  collected.  Model  surveillance  systems 
may  facilitate  the  examination  and  comparison  of  a  variety  of  approaches  that  would 
not  be  feasible  on  too  large  a  scale  and  may  identify  methods  suitable  for  other 
conditions  or  other  settings. 

The  data  to  be  collected  by  a  surveillance  system,  the  data  sources  and  collection 
methods,  and  the  procedures  for  handling  the  information  should  be  developed  and 
tested. 

Data  Analysis 

A  determination  of  the  appropriate  analytic  approach  to  data  should  be  an  integral 
part  of  the  planning  of  any  surveillance  system.   The  data  needed  to  address  the 
salient  questions  must  be  assessed  to  assure  that  the  data  source  or  collection 
process  is  adequate.   Analyses  may  prove  to  be  as  simple  as  an  ongoing  review  of  all 
cases  of  rare  but  potentially  devastating  illnesses,  such  as  plague.   For  most 
conditions, however, an assessment of the crude number of cases and rates is followed
by a description of the population in which the condition occurs (person), where the
condition occurs (place), and the period over which the condition occurs (time).
These  basic  analyses  require  decisions  as  to  the  kind  of  information  that  needs  to  be 
collected.  The  level  of  detail  required  varies  substantially  from  condition  to 
condition.   For  instance,  one  may  need  more  detailed  information  regarding  the 
population  that  is  not  receiving  prenatal  care  than  on  the  one  that  is  exposed  to 
meningococcal  disease,  because  the  nature  of  the  intervention  for  the  former  is  likely 
to  be  more  complex  and  require  an  understanding  of  socioeconomic  factors.   Similarly, 
how  one  will  collect  data  on  geographic  areas  may  depend  on  whether  the  data  will  be 
examined at the county, state, or census-tract level.
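
The basic description by person, place, and time amounts to tallying case
reports along each dimension. A minimal sketch follows; the case records, the
county names, and the age groupings are hypothetical, and a real analysis would
also need population denominators to turn the counts into rates.

```python
from collections import Counter
from datetime import date

# Hypothetical line list: one record per reported case.
cases = [
    {"age": 3,  "county": "Adams", "onset": date(1992, 1, 6)},
    {"age": 34, "county": "Adams", "onset": date(1992, 1, 8)},
    {"age": 71, "county": "Baker", "onset": date(1992, 1, 14)},
    {"age": 4,  "county": "Baker", "onset": date(1992, 1, 15)},
]


def age_group(age):
    """Assign an age in years to an (assumed) reporting age group."""
    if age < 5:
        return "<5"
    if age < 20:
        return "5-19"
    if age < 65:
        return "20-64"
    return "65+"


def iso_week(d):
    """Return (ISO year, ISO week number) for a date."""
    return tuple(d.isocalendar()[:2])


by_person = Counter(age_group(c["age"]) for c in cases)
by_place = Counter(c["county"] for c in cases)
by_time = Counter(iso_week(c["onset"]) for c in cases)

print(by_person)
print(by_place)
print(by_time)
```

The level of geographic detail stored on each record (here, county) fixes the
finest level at which the data can later be examined, which is why that
decision belongs in the planning stage.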

Most  contemporary  surveillance  systems  are  maintained  electronically.   The  types  of 
analyses  to  be  performed  and  the  size  of  the  data  bases  should  suggest  the  type  of 
hardware and software needed (see Chapter XI). As personal computers become more
powerful,  the  capacity  of  data-storage  devices  continues  to  grow,  and  data-sharing 
systems  such  as  local-  and  wide-area  networks  become  more  widely  available,  more 
surveillance  systems  can  be  operated  on  personal  computers.   Software  to  meet  most 
basic  analytic  needs  for  surveillance,  including  mapping  and  graphing,  is  now  widely 
available.   The  analytic  approach  often  suggests  a  basic  set  of  analyses  that  are 
performed  on  a  regular  basis.   These  analyses  can  be  designed  early  in  the  development 
of  the  system  and  incorporated  into  an  automated  system,  which  can  then  be  run  by 
support personnel.
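
One example of a routine analysis that could be automated in this way is a
comparison of the current count with a historical baseline. The sketch below
uses a simple mean-plus-two-standard-deviations threshold; this is only one of
many possible approaches, and the counts and the multiplier are assumptions
made for illustration rather than a method prescribed in this chapter.

```python
from statistics import mean, stdev


def exceeds_baseline(current, baseline, z=2.0):
    """Compare a current count with mean + z * SD of baseline counts.

    Returns (flag, threshold). `baseline` holds counts for comparable
    earlier periods, e.g. the same weeks in previous years.
    """
    threshold = mean(baseline) + z * stdev(baseline)
    return current > threshold, threshold


# Hypothetical counts for the same reporting period in five earlier years.
baseline_counts = [12, 15, 11, 14, 13]

flag, threshold = exceeds_baseline(22, baseline_counts)
print(flag, round(threshold, 2))
```

A flag from a rule of this kind is a prompt for review by an epidemiologist,
not a conclusion; changes in case definitions or reporting practices can
produce the same signal as a true increase.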

The  adequacy  of  the  data  system  and  processing  mechanisms  should  be  assured. 

Interpretation  and  Dissemination 

Data  must  be  analyzed  and  presented  in  a  compelling  manner  so  that  decision  makers  at 
all  levels  can  readily  see  and  understand  the  implications  of  the  information. 
Knowledge  of  the  characteristics  of  the  audiences  for  the  information  and  how  they 
might  use  it  may  dictate  any  of  a  variety  of  communications  systems.   Routine,  public 
access to the data--consistent with privacy constraints--should be planned for and
provided. This access can be facilitated with various electronic media, ranging from
systems with structured-analysis features suitable for general users to files of raw
data  for  persons  who  can  do  special  or  more  detailed  analyses  themselves. 

The  primary  users  of  surveillance  information,  however,  are  public  health 
professionals  and  health-care  providers.   Information  directed  primarily  to  those 
individuals  should  include  the  analyses  and  interpretation  of  surveillance  results, 
along  with  recommendations  that  stem  from  the  surveillance  data.   Graphs  and  maps 
should  be  used  liberally  to  facilitate  rapid  review  and  comprehension  of  the  data. 
Communications  media  represent  a  valuable  secondary  audience  that  can  be  used  to 
amplify  the  messages  from  surveillance  information.   The  media  play  an  important  role 
in  presenting  and  reinforcing  health  messages.   Innovative  methods  for  presenting 
information  capitalizing  on  current  audiovisual  technology  should  be  explored  (see 
Chapter VII).

Evaluation 

Planning,  like  surveillance  itself,  is  an  iterative  process  requiring  the  regular 
reassessment of objectives and methods (see Chapter VIII). The fundamental question
to  be  answered  in  evaluation  is  whether  the  purposes  of  the  surveillance  system  have 
been  met.   Did  the  system  generate  needed  answers  to  problems?  Was  the  information 
timely?  Was  it  useful  for  planners,  researchers,  health-care  providers,  and  public 
health  professionals?   How  was  the  information  used?  Was  it  indeed  worth  the  effort? 
Would those who participated in the system be willing to continue to do so?
What could be done to enhance the attributes of the system (timeliness, simplicity,
flexibility, acceptability, sensitivity, predictive-value positive, and
representativeness)?

Answers  to  these  questions  will  direct  subsequent  efforts  to  revise  the  system. 
Changes  might  be  minor  (e.g.,  the  addition  of  data  elements  to  existing  forms),  or 
major  (e.g.,  the  need  to  obtain  information  from  entirely  different  data  sources). 
For  example,  a  system  to  determine  utilization  of  mammography  might  be  based  on 
administrative  billing  systems.   Yet,  problems  with  reports  of  multiple  mammography 
examinations  for  the  same  individual  might  require  the  addition  of  unique  patient 
identifiers  or  the  addition  of  questions  on  mammography  use  from  self  reports  on 
health-interview  surveys.   If  access  emerges  as  a  critical  factor  in  mammography 
utilization, then ongoing monitoring of the quantity and location of mammography
facilities  or  monitoring  for  appropriate  insurance  coverage  for  mammography  might  be 
indicated. 
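
The problem of multiple mammography reports for the same individual can be
illustrated with a short sketch that counts women rather than claims. The
record layout, the `patient_id` field, and the number of eligible women are all
hypothetical; an actual billing system would have its own identifiers and
would raise the confidentiality issues discussed earlier in this chapter.

```python
# Hypothetical billing records: one row per mammography claim.
claims = [
    {"patient_id": "P001", "service_date": "1992-03-02"},
    {"patient_id": "P001", "service_date": "1992-03-16"},  # repeat examination
    {"patient_id": "P002", "service_date": "1992-04-10"},
    {"patient_id": "P003", "service_date": "1992-05-21"},
]


def women_screened(records):
    """Count distinct women, using the unique patient identifier."""
    return len({r["patient_id"] for r in records})


def utilization(records, eligible_women):
    """Proportion of eligible women with at least one examination."""
    return women_screened(records) / eligible_women


print(len(claims))              # number of claims
print(women_screened(claims))   # number of distinct women
print(utilization(claims, 10))  # assumes 10 eligible women in the population
```

Without the identifier, the four claims would be read as four women screened,
overstating utilization in this invented example.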

Periodic  rigorous  evaluation  assures  that  surveillance  systems  remain  vibrant. 
Systems  that  assess  problems  whose  only  interest  is  historical  should  be  discontinued 
or  simplified  to  reduce  the  reporting  burden.   Contemporary  systems  should  take 
advantage  of  the  emergence  of  new  technology  for  information  collection,  analysis,  and 
dissemination.  They  should  capitalize  on  new  information  systems.  For  example, 
sentinel  surveillance  systems  have  become  more  flexible  to  allow  the  inclusion  of  an 
array  of  topics.  Electronic  medical  records  and  standardized  clinical  data  bases  all 
provide  opportunities  to  obtain  data  that  have  been  burdensome  or  difficult  to  secure 
(6). These information sources may also provide data in a more timely fashion and
may  allow  individuals  to  be  tracked,  an  option  that  would  be  virtually  impossible 
without  such  electronic  systems. 

INVOLVEMENT  OF  INTERESTED  PARTIES  IN  SURVEILLANCE 

Virtually  all  surveillance  systems  involve  networks  of  organizations  and  individuals. 
Surveillance  of  notifiable  disease  relies  on  health-care  providers  including 
clinicians,  hospitals,  and  laboratories  to  report  to  local  health  departments,  who 
have  the  initial  responsibility  for  responding  to  reports  and  amassing  data.   In  many 
states,  epidemiologists  in  the  state  health  departments  are  responsible  for 
surveillance  and  control  of  notifiable  diseases  in  their  states.   In  larger  states, 
other  organizational  units--such  as  those  dealing  with  sexually  transmitted  disease, 
immunization,  or  tuberculosis  control--often  have  primary  responsibility  for 
surveillance  and  control  of  specific  diseases  or  injuries.  The  state  epidemiologist 
is  responsible  for  the  ongoing  quality  control,  collection,  analysis,  interpretation, 
dissemination,  and  use  of  notifiable-disease  data  within  that  state.  Data  are 
subsequently  forwarded  each  week  to  the  national  level  where  they  are  again  analyzed, 
interpreted,  and  disseminated. 

Programs  for  injuries  and  chronic  and  environmental  diseases  also  may  have  complex 
organizational  structures  and  may  involve  a  wide  array  of  external  professional  and 
voluntary interest groups whose needs must be addressed. Some basic surveillance
information can be gleaned from such ongoing information systems as vital records,
hospitalization  programs,  and  registries.   Although  some  of  these  conditions  are  part 
of  state  notifiable-disease  lists,  many  require  surveillance  systems  to  be  established 
in  unique  places  (e.g.,  rehabilitation  units  and  emergency  medical  services  for 
spinal-cord injuries or radiology centers for mammography). The support and interest
of  these  groups  of  constituents  are  valuable  in  establishing  the  systems;  these  groups 
can  provide  key  input  regarding  purposes  of  systems  and  users  of  systems,  as  well  as 
assistance  in  developing  the  systems  themselves. 

The complex relationships among these organizational units and their constituents
require open communication to establish priorities and methods consistent with the
needs  and  resources  of  each  group.  The  conflicting  desire  for  more  detailed 
information  must  be  balanced  against  the  associated  burden  and  cost,  as  well  as 
against  the  utility  of  collecting  extensive  amounts  of  data.   For  example,  electronic 
systems  that  may  facilitate  higher  quality,  more  complete,  and  more  timely  data  also 
involve  the  commitment  of  equipment,  training,  and  changes  in  day-to-day  activities 
that  may  permeate  all  levels  of  the  system.   One  must  understand  the  needs  of  each 
recipient  group  for  the  information  and  assess  and  assure  their  commitment  to  the 
system.   It  is  also  critical  to  be  attentive  to  how  components  of  the  system  can  best 
be  integrated  into  the  overall  system  in  terms  of  day-to-day  operations. 

The  Council  of  State  and  Territorial  Epidemiologists,  an  affiliate  of  the  Association 
of  State  and  Territorial  Health  Officials,  has  the  authority  in  the  United  States  to 
recommend  which  health  conditions  should  be  notifiable.  After  this  list  has  been 
agreed  upon,  it  is  then  up  to  each  state  to  determine  whether  and  how  the  conditions 
should  be  made  reportable.  Although  most  states  report  all  those  conditions 
considered  to  be  nationally  notifiable,  a  wide  range  of  conditions  are  reportable  in 
only a few states (3). States may exercise their authority through regulations,
boards  of  health,  or  legislative  procedures.  The  diversity  of  these  methods  is 
described  more  fully  in  Chapter  XII.   Each  of  these  mechanisms  entails  the  involvement 
of  groups  with  an  array  of  medical,  administrative,  public  health,  and  policy 
interests. 

The  success  of  surveillance  depends  heavily  on  the  quality  of  the  information  entered 
into the system and on the value of the information to its intended users. A clear
understanding of how policy makers, voluntary and professional groups, researchers,
and  others  might  use  surveillance  data  is  valuable  in  garnering  the  support  of  these 
audiences for the surveillance system.

"The real voyage of discovery consists not in seeing new landscapes but in having new
eyes."

Marcel  Proust 


INTRODUCTION 

This  chapter  reviews  sources  of  routinely  collected  data  that  can  be  used  for  public 
health  surveillance.  In  many  instances,  these  sources  will  provide  sufficient 
information so that active case-finding for the health event of interest may not be
necessary. In other instances, analysis of routinely collected data, in conjunction
with active case-finding, will provide the basis for a comprehensive assessment of the
public  health  impact  of  a  particular  health  event. 

For infectious diseases, surveillance activities have traditionally relied on
"notifiable" disease reporting systems based on legally mandated reporting of cases to
health  officials.  Depending  on  characteristics  of  the  reporting  system  and  of  the 
specific  health  event,  these  systems  can  provide  timely  information  that  is 
particularly  useful  for  monitoring  short-term  trends  and  for  detecting  outbreaks  or 
epidemics  of  disease.  While  prevention  and  control  of  infectious  diseases  remains  a 
mainstay  of  public  health  practice,  there  is  increasing  emphasis  on  monitoring  the 
public  health  impact  of  non-infectious  or  chronic  diseases  and  injuries,  as  well  as 
risk  factors  for  these  conditions,  including  behavioral  risk  factors,  demographic 
characteristics,  and  potential  exposure  to  toxic  agents.  With  the  expansion  in  the 
number  and  type  of  health  events  under  surveillance,  the  use  of  existing  data  sources, 
such  as  vital  statistics  and  more  recently  hospital  discharge  data,  has  expanded;  and 
new  data  sources,  such  as  behavioral  risk  factor  surveys,  have  been  developed. 

This  chapter  describes  characteristics  of  six  types  of  health  information  systems  in 
which  data  are  collected  routinely  and  are  generally  available  for  analysis.  The  six 
are  notifiable  disease  and  related  reporting  systems,  vital  statistics,  sentinel 
surveillance,  registries,  health  surveys,  and  administrative  data  collection  systems. 
As  more  sources  of  health  information  become  available,  effective  surveillance  for  a 
specific health event, whether infectious or non-infectious, will rely on analysis and
synthesis  of  information  from  a  variety  of  sources,  each  of  which  has  different 
strengths  and  limitations.  In  many  instances,  these  sources  will  provide  sufficient 
information  so  that  active  case-finding  or  other  surveillance-related  activities  may 
not  be  necessary.  In  other  instances,  analysis  of  routinely  collected  data,  in 
conjunction  with  other  activities,  will  provide  the  basis  for  a  comprehensive 
assessment  of  the  public  health  impact  of  a  particular  health  event.  For  cervical 
cancer,  for  instance,  surveillance  activities  could  include  the  following: 
comprehensive  assessment  of  cancer  incidence  data  and  cancer  mortality  data;  reports 
of cervical cytology and genital infections by laboratories; reports of Pap smear
histories,  smoking  patterns,  genital  infections  and  safe  sex  practices  from  health 
surveys;  review  of  hospital-discharge  data  to  monitor  surgical  treatment  for  advanced 
disease; and information from a variety of sources on attitudes, payment strategies,
and other barriers or inducements that could influence the prevention, early
detection,  and  treatment  of  cervical  cancer.  The  selection  and  appropriate  use  of  data 
from  these  sources  would  depend  primarily  on  the  nature  and  scope  of  activities  to  be 
monitored  as  part  of  a  cervical  cancer  control  program. 

Depending  on  the  health  event  of  interest,  special  short-term  or  demonstration 
projects  can  also  provide  information  that  is  very  useful  for  surveillance  or  other 
prevention-related activities. This chapter, however, focuses on sources of data in
which  information  on  a  wide  range  of  health  events  is  collected  on  a  routine,  ongoing 
basis  and  is  generally  available  for  analysis. 

The  examples  provided  in  this  chapter  are  meant  to  be  illustrative  rather  than 
exhaustive.  Many  examples  are  research-  rather  than  surveillance-related,  but  they  do 
highlight  potential  uses  of  these  data  sources  for  surveillance  and  related 
activities.  The  background  information  provided  on  the  methods  used  to  collect 
different  types  of  data  serves,  however,  as  a  starting  point  for  a  more  detailed 
assessment  of  the  strengths  and  limitations  of  these  data  systems  for  surveillance  of 
a  particular  health  event.  The  sources  of  data  mentioned  in  this  chapter  are  listed 
separately  in  Appendix  A. 

Information on the availability of routinely collected health and population data is
available from a variety of sources. Federal agencies that provide data in the United
States  include  the  following  organizations: 

•  the Centers for Disease Control (CDC), including the National Center for
   Health Statistics (NCHS);
•  the National Institutes of Health (NIH), including the National Cancer
   Institute (NCI), the National Heart, Lung, and Blood Institute (NHLBI),
   the National Institute on Drug Abuse (NIDA),
   the National Institute on Alcohol Abuse and Alcoholism (NIAAA), and the
   National Institute of Mental Health (NIMH);
•  the Food and Drug Administration (FDA);
•  the Agency for Health Care Policy and Research (AHCPR);
•  the Indian Health Service (IHS);
•  the Health Care Financing Administration (HCFA);
•  the National Highway Traffic Safety Administration (NHTSA);
•  the Consumer Product Safety Commission (CPSC); and
•  the Bureau of the Census.

State  health  departments  also  routinely  collect  health  information,  some  of  which  is 
not  available  from  federal  sources;  and  private  organizations  (e.g.,  the  Public  Health 
Foundation  and  the  National  Association  of  Health  Data  Organizations)  either  have 
health  information  or  maintain  inventories  of  information  that  can  be  obtained  from 
other  sources. 

Information  is  available  in  other  countries  from  similar  national  or  local  agencies 
(1-4). The United Nations and the World Health Organization (WHO) routinely publish
population estimates and summary information on mortality and natality in member
countries (5-6). Health and demographic information is also available from regional
offices  such  as  WHO/Europe  (7). 

NOTIFIABLE  DISEASE  AND  RELATED  REPORTING  MECHANISMS 

Overview 

Reporting  on  notifiable  diseases  at  the  national  level  originated  in  the  United  States 
in  1878,  when  Congress  authorized  the  United  States  Public  Health  Service  (PHS)  to 
collect  reports  on  morbidity  from  cholera,  smallpox,  plague,  and  yellow  fever,  each  of 
which  was  controlled  through  quarantine  measures  (8,9).   Although  initially  focused  on 
foreign  ports,  authority  for  weekly  reporting  was  expanded  in  1893  to  include  states 
and  municipal  authorities  (9).   To  increase  uniformity,  the  Surgeon  General  was 
authorized  in  1902  to  provide  forms  for  the  collection,  completion,  and  publication  of 
reports at the national level. Weekly telegraphic reporting was recommended for a few
diseases  in  1903,  and  by  1928,  all  states,  the  District  of  Columbia,  Hawaii,  and 
Puerto  Rico  were  participating  in  national  reporting  of  specified  conditions  (8). 
Compulsory  notification  for  selected  infectious  diseases  was  also  instituted  in  many 
other  countries  in  the  late  1800s,  including  Japan  (1880),  Scotland  (1887),  Italy 
(1888),  England  and  Wales  (1889),  and  Northern  Ireland  (1899)  (2,3,10). 

The  list  of  diseases  for  which  notification  is  recommended  has  changed  over  time,  and, 
although there is overlap, the lists vary from jurisdiction to jurisdiction. In the
United States, for instance, 47 infectious diseases were considered notifiable at the
national level in 1989 and were reported to CDC through the National Notifiable
Disease Surveillance System (NNDSS) (11). In at least one state, however, reporting
was required for over 160 infectious diseases or related conditions, 90 occupational
diseases, 23 other environmental diseases, 29 congenital or related conditions, and
six diseases of unknown cause. With the addition of Lyme disease and Haemophilus
influenzae in 1991, 49 infectious diseases are currently notifiable at the national
level in the United States (12). In recent years, lists of notifiable diseases in
other countries included 66 diseases in Italy (19 with rapid reporting procedures), 32
in Scotland and in Japan, 29 in England and Wales, and 26 in Northern Ireland
(2,3,10). Procedures for modifying the list of notifiable diseases also vary from
country to country. In the United States, reporting for notifiable diseases is
mandated at the state level, and the Council of State and Territorial Epidemiologists
(CSTE), a consortium of epidemiologists from all state and territorial health
departments, recommends a list of conditions to be reported each week to CDC (12).
National  reporting  is  required  for  three  quarantinable  diseases--plague,  cholera,  and 
yellow  fever.  Cases  of  these  three  diseases  are  also  reported  to  the  WHO  by  member 
countries. 

In  the  United  States,  occupational  diseases  or  occupation-related  conditions  are 
considered  notifiable  in  some  states,  but  at  present,  occupation-related  conditions 
are  not  reported  nationally  (13,14).    In  1988,  at  least  one  occupation-related 
condition  was  considered  reportable  in  34  states  or  other  jurisdictions.  Lead 
poisoning,  pesticide  poisoning,  and  occupation-related  lung  diseases  are  among  the 
occupation-related conditions that are reportable in many states.

In  recent  years,  notifiable-disease-reporting  mechanisms  have  been  used  in  some 
localities to collect information on conditions that are not infectious,
occupation-related, or vaccine-related. In the United States, spinal-cord injuries, elevated
blood lead levels for children and for occupationally exposed workers, and Alzheimer's
disease  are  among  the  conditions  for  which  reporting  is  required  in  some  localities, 
although  national  reporting  is  not  recommended  by  CSTE  (15-17). 

Reporting  in  the  United  States  for  adverse  events  following  vaccination  or  in 
association with the administration of drugs differs from other notifiable-disease
reporting procedures in that the former types of events are reported nationally rather
than to state health departments. Since 1988, all health-care providers and vaccine
manufacturers have been required to report certain suspected adverse events following
specific vaccinations (18). The Vaccine Adverse Event Reporting System (VAERS), in
which  all  reports  of  suspected  adverse  events  following  any  vaccination  are  accepted, 
became  operational  in  1990. 

Adverse drug reactions are reported in the United States to the FDA (19,20). Drug
manufacturers  are  required  to  submit  post-approval  reports  of  adverse  drug  reactions 
as  well  as  reports  from  ongoing  clinical  trials  and  selected  reports  from  foreign 
sources.  Reports  submitted  to  manufacturers  by  providers  are  sent  to  the  FDA,  or 
providers  and  patients  can  submit  reports  directly.  Nearly  60,000  reports  were 
submitted  in  1989.  Many  other  countries  have  similar  adverse-drug-reaction  reporting 
systems,  and  about  23  of  these  report  data  to  the  WHO  Collaborating  Center  for 
International Drug Monitoring (21). In England, adverse drug effects associated
with specific drugs can be actively monitored through the Prescription Event
Monitoring System, which is funded through both public and private sources (21,22).

Data  Collection,  Transmission,  and  Dissemination 

Although  information  on  notifiable  diseases  is  collated  and  published  nationally,  its 
primary  purpose  is  to  direct  local  prevention  and  control  programs.  In  the  United 
States,  information  is  generally  reported  by  clinicians  to  local  or  state  health 
departments.  State  regulations  governing  notifiable  disease  reporting  are  often  quite 
specific  regarding  timeliness  of  reporting.  For  conditions  in  which  an  immediate 
public  health  response  is  needed,  notification  by  telephone  is  usually  mandated, 
either  immediately  or  within  24  hours  of  a  suspected  case.  Other  conditions  are 
generally  reported  on  a  weekly  basis  after  the  diagnosis  has  been  confirmed. 

For conditions that are reported nationally in the United States through the NNDSS, a
subset of information--including the age, gender, race, and date of occurrence (or
report)--is sent weekly to CDC by state health departments or other jurisdictions in a
standard format, either as individual case reports or aggregate reports. Personal
identifiers are not included in the NNDSS. Since 1990, all reporting states and
localities have transmitted information electronically to CDC through the National
Electronic Telecommunications System for Surveillance (NETSS) (23). National case
counts for most notifiable diseases are published the week after they are reported to
CDC in the Morbidity and Mortality Weekly Report (MMWR).
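
As a rough illustration of the weekly reporting described above, the following Python sketch tallies de-identified case records by condition and week. The record fields follow the subset of information named in the text. The class name CaseReport, the function weekly_counts, and the use of ISO calendar weeks are assumptions made for illustration only; they do not represent the NETSS record format or the week convention used in the MMWR.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CaseReport:
    """Hypothetical de-identified case record (no personal identifiers)."""
    condition: str
    age: int
    sex: str
    race: str
    event_date: date  # date of occurrence, or of report if occurrence is unknown


def weekly_counts(reports):
    """Count case reports by (condition, ISO year, ISO week).

    ISO weeks are an assumption here; a real system would use
    whatever reporting-week convention it has adopted.
    """
    counts = Counter()
    for report in reports:
        iso_year, iso_week, _weekday = report.event_date.isocalendar()
        counts[(report.condition, iso_year, iso_week)] += 1
    return counts


# Illustrative records only; these are not real surveillance data.
example_reports = [
    CaseReport("measles", 7, "F", "unknown", date(1990, 1, 2)),
    CaseReport("measles", 9, "M", "unknown", date(1990, 1, 4)),
    CaseReport("measles", 30, "F", "unknown", date(1990, 1, 9)),
]
for key, n in sorted(weekly_counts(example_reports).items()):
    print(key, n)
```

Keeping the tally keyed by condition and week mirrors the aggregate-report option mentioned in the text, while the individual records correspond to the case-report option.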

Most  state  health  departments  also  disseminate  surveillance  data  and  other  public 
health  information  to  health-care  providers  through  weekly  or  monthly  newsletters.  For 
some  conditions,  including  measles,  hepatitis,  syphilis,  and  acquired  immunodeficiency 
syndrome (AIDS), more detailed information on risk factors and other information
needed  for  disease-control  programs  is  also  collected  by  state  and  local  health 
departments  and,  in  some  instances,  is  sent  to  CDC.  Information  is  also  sent  to  CDC 
through  NETSS  for  conditions  such  as  spinal  cord  injuries,  giardia  infection,  and  Reye 
syndrome,  that  are  not  nationally  notifiable  but  for  which  information  is  useful  at 
the  national  level.  Although  their  use  in  the  United  States  is  limited  primarily  to 
influenza  surveillance,  networks  of  sentinel  health-care  providers  in  many  European 
countries  report  supplemental  information  on  notifiable  diseases  to  local  and  national 
health officials (see below).

Surveillance  for  zoonotic  diseases  also  involves  monitoring  animal  hosts  that  either 
transmit  the  disease  directly  to  humans  or  are  also  susceptible  to  the  disease.  For 
various  types  of  encephalitis,  for  instance,  detection  of  elevated  virus  titers  in 
mosquitoes,  wild  birds,  sentinel  flocks  of  chickens,  or  horses  can  signal  that  an 
outbreak  of  human  disease  may  occur  so  that  mosquito-control  activities  can  be 
initiated (24). Similarly, the potential for human cases of rabies is assessed through
monitoring wild skunks, raccoons, bats, and other animal vectors (25); the potential
for human plague is assessed by monitoring rodents in endemic areas (26); and Rocky
Mountain spotted fever and Lyme disease are monitored through testing of ticks
(27,28).

Although  most  cases  of  notifiable  conditions  are  reported  by  clinicians,  the  role 
laboratories  play  in  reporting  notifiable  conditions  is  becoming  increasingly 
important.  In  the  United  States,  many  states  have  developed  reporting  requirements  for 
laboratories  and  hospitals  for  conditions  that  need  laboratory  confirmation  for 
diagnosis (11,29,30). In New York City, for instance, laboratories are required to
report elevated blood-lead levels in children, and at least five states rely on
laboratory reporting to identify workers with elevated levels of lead or other heavy
metals (15). Comprehensive, nationwide reporting by laboratories is not yet available
in the United States, but in England, Wales, and Northern Ireland, nearly all
microbiology laboratories voluntarily report positive identifications of selected
conditions to the national Public Health Laboratory Service (PHLS) (10).

Strengths  and  Limitations 

Although  many  diseases  or  conditions  are  considered  notifiable,  compliance  is  poor  in 
most countries and sanctions are rarely enforced. As Sherman and Langmuir noted in
1952, "Our system of notification of individual case reports is a haphazard complex of
interdependence, cooperation, and goodwill among physicians, nurses, and county and
state health officers, school teachers, sanitarians, laboratory technicians,
secretaries, and clerks. It is a rambling system with variations as numerous as the
individual diseases for which reports are requested, and as numerous as the interests
and individual traits of the administrative health officers, epidemiologists, and
statisticians in [all] the . . . States and the several federal agencies concerned with
the data" (31). Indeed, it is remarkable--given the jerry-rigged nature of the system--
that the information collected is at all useful.

Under-reporting  is  a  consistent  and  well-characterized  problem  of  notifiable-disease- 
reporting  systems  (see  Chapter  12).  In  the  United  States,  estimates  of  completeness  of 
reporting range from 6% to 90% for many of the common notifiable diseases (32).
Reporting  is  generally  more  complete  for  conditions  such  as  plague  and  rabies  that 
cause  severe  clinical  illness  with  serious  consequences.  Among  the  many  factors  that 
contribute  to  incomplete  reporting  of  notifiable  conditions  are  lack  of  medical 
consultation  for  mild  illnesses;  concealment  by  patients  or  health-care  providers  of 
conditions  that  might  cause  social  stigma;  lack  of  awareness  of  reporting 
requirements;  lack  of  interest  by  the  medical  community;  incomplete  etiologic 
definition  of  notifiable  conditions;  inadequate  case  definitions  for  surveillance 
purposes;  variation  in  clinical  expertise  in  diagnosing  conditions  in  different  areas; 
changes  in  procedures  for  verifying  reports  from  providers;  variation  in  the  use  of 
laboratory  confirmation;  variation  in  laboratory  procedures;  the  effectiveness  of 
control  measures  in  effect;  and  priorities  of  health  officials  at  local,  state,  and 
national levels (9,30,33). Conversely, increased concern can result in an increase in
reported cases. Public health officials may actively solicit information if an
outbreak is suspected, and case reports may increase in response to reports by the
media.
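
Completeness of reporting, as the term is used above, can be read as the fraction of the cases occurring in a population that reach the reporting system. The following Python sketch shows the arithmetic under the assumption that an independent estimate of the true number of cases is available (for example, from a special study). The function names and the example figures are hypothetical and are not taken from the studies cited.

```python
def completeness(reported_cases: int, estimated_true_cases: float) -> float:
    """Fraction of the estimated true cases that were actually reported."""
    if estimated_true_cases <= 0:
        raise ValueError("estimated_true_cases must be positive")
    return reported_cases / estimated_true_cases


def adjusted_case_count(reported_cases: int, completeness_fraction: float) -> float:
    """Scale a reported count up to an estimated true count.

    Assumes the completeness fraction applies uniformly to all cases,
    which is a simplification.
    """
    if not 0 < completeness_fraction <= 1:
        raise ValueError("completeness_fraction must be in (0, 1]")
    return reported_cases / completeness_fraction


# Hypothetical figures: 30 reported cases against an estimated 500 true cases.
print(completeness(30, 500))
print(adjusted_case_count(30, 0.06))
```

An adjustment of this kind is only as good as the completeness estimate behind it, and a single fraction can mislead when under-reporting differs between groups.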

The  extent  of  under-reporting  can  vary  by  risk  group.  An  evaluation  of  reporting  for 
AIDS  in  Philadelphia  found,  for  instance,  that  under-reporting  was  more  prevalent  for 
those  who  were  employed  in  white-collar  occupations  and  who  had  private  health 
insurance (34). Similarly, a review of hospital-discharge data in South Carolina
indicated that AIDS diagnoses were less likely to be reported for whites over 40 years
of age (35).

Changes  in  case  definitions  and  the  extent  to  which  laboratory  confirmation  is 
required  for  reporting  can  also  affect  reporting  for  notifiable  conditions.  In  the 
United  States,  a  1984  survey  of  state  epidemiologists  found  substantial  variation  in 
definitions  used  for  communicable  disease  surveillance  by  state  health  departments. 
Since  then,  surveillance  case  definitions  have  been  developed  for  many  communicable 
diseases  and  occupational  conditions,  as  well  as  for  spinal-cord  injuries  (14,17).  The 
degree to which standardized case definitions for notifiable-disease reporting have
been adopted varies, but recent experience suggests that further important changes in
reported trends are likely as these definitions become more widely used. The 1987
revision of the surveillance case definition for AIDS resulted in an increase in the
number of reported cases among heterosexual drug abusers (36). Changes in the
surveillance case definition for congenital syphilis resulted in a 5-fold increase in
cases in some reporting areas (37,38). Adoption of a uniform case definition for Lyme
disease is probably reflected in the decrease in reported cases in the United States
in 1990 (39).

The  extent  to  which  clinical  reports  are  confirmed  with  laboratory  findings  can  have  a 
substantial  impact  on  reporting  rates.  For  instance,  malaria  was  endemic  in  the 
southeastern  United  States  in  the  1930s.  Epidemiologic  studies  in  1947  indicated  that 
routine  reporting  of  aggregate  case  counts  based  on  clinical  findings  alone  was  not 
providing  an  accurate  picture  of  current  disease  activity.  When  reporting  of 
individual  cases  with  laboratory  confirmation  was  required,  it  became  clear  that 
endemic  malaria  had  disappeared  between  1935  and  1945,  before  malaria  control  programs 
based  on  drainage  and  indoor  residential  spraying  of  DDT  were  initiated  (40,41).    In 
recent years, the role of laboratories has been particularly important for
surveillance of the numerous subtypes of Salmonella, legionellosis, and nosocomial
infections and for detecting elevated blood-lead levels (15,30,42). Without
laboratory-based surveillance, for instance, a large outbreak of drug-resistant Salmonella
newport that originated from animals fed antimicrobials might not have been detected
(43).

In  spite  of  their  limitations,  surveillance  systems  based  on  reporting  of  notifiable 
conditions  are  a  mainstay  of  public  health  surveillance.  Unlike  most  other  sources  of 
routinely  collected  data,  information  from  notifiable-disease  systems  is  available 
quickly  and  from  all  jurisdictions.  Knowledge  of  the  specific  characteristics  of 
reporting  for  a  particular  condition  is  helpful  in  interpreting  the  findings.  While 
long-term  trends  may  be  difficult  to  interpret  without  supplemental  information, 
notifiable-disease  systems  can  often  detect  outbreaks  or  other  rapid  changes  in 
disease  incidence  in  a  timely  manner  so  that  control  activities  can  be  initiated.  As 
appropriate,  initial  observations  can  be  evaluated  further  with  additional  studies. 
Notifiable-disease  systems  can  also  detect  changes  in  patterns  of  disease  by 
demographic characteristics or risk groups. In the United States, for instance, human
immunodeficiency virus (HIV) and AIDS surveillance systems have identified new risk
groups including intravenous drug abusers and their mates and have highlighted the
emerging problem of children who are born HIV-infected. Evaluation of surveillance
information has also led to changes in disease prevention and control strategies. On
the  basis  of  reports  of  measles  among  elementary  school-,  high  school-,  and  college- 
age  students,  recommendations  for  measles  vaccination  in  the  United  States  were 
recently changed to include a two-dose schedule (44). Similarly, because strategies
based on vaccination of high-risk groups have not been as effective as originally
anticipated, recommendations for hepatitis B vaccination have recently been modified
(45).

In  the  United  States,  reports  of  adverse  drug  reactions  often  result  in  labeling 
changes  for  new  drugs  (19).    Drug  withdrawals  are  infrequent,  although  two  drugs  (an 
antidepressant and a non-steroidal anti-inflammatory agent) have been withdrawn in
recent  years.  Vaccine  adverse-event-reporting  systems  are  important  for  detecting 
potential  problems  following  administration  of  vaccine,  such  as  an  increase  in 
paralytic  poliomyelitis  among  recently  vaccinated  children  in  the  1950s  and  the 
increase in Guillain-Barré syndrome following vaccination for swine influenza
(18,46,47).

Notifiable-disease-reporting mechanisms have also been important for identifying
unusual  conditions  that  appear  to  be  increasing  and  for  obtaining  a  preliminary 
assessment  of  their  public  health  impact.  Among  the  more  recent  examples  in  the  United 
States  are  AIDS,  toxic-shock  syndrome,  legionellosis,  Reye  syndrome,  and  eosinophilia- 
myalgia  syndrome  (EMS).  Following  the  initial  report  from  a  state  health  department, 
nationwide  surveillance  for  EMS  using  a  standard  case  definition  was  instituted  within 
a few days, and, through additional studies, the putative agent was identified (48).

In  the  future,  reporting  of  notifiable  conditions  may  rely,  in  part,  on  computerized 
data  bases  developed  for  billing  and  other  purposes.  However,  the  utility  of  these 
systems is limited at present: first, because International Classification of Diseases
(ICD) codes are often not used to identify infectious agents on billing records and,
second, because information in these large data bases is not available immediately
(49). In the near term, improvements in notifiable-disease reporting in most areas
are  likely  to  be  related  to  increased  reliance  on  laboratory-based  reporting  and  on 
the  use  of  sentinel  health-care  providers  or  sentinel  sites. 

VITAL STATISTICS

Overview

The systematic registration of vital events had its origins in the parish registers of
15th-century Western Europe (1). One of these registers, the Bills of Mortality--a
weekly tally begun in 1532 of the number of persons who died in London from plague and
other causes--was used to study patterns of mortality by John Graunt, one of the first
to use numerical methods to study disease (50).

Parish  registers  were  superseded  in  the  19th  century  by  civil  registers  kept  for  legal 
purposes.   Registration  of  vital  events  usually  remains  the  responsibility  of  local 
authorities,  but  the  use  of  standard  procedures  for  collecting,  coding,  and  reporting 
vital  events--first  used  systematically  by  William  Farr  in  Great  Britain  in  the  1830s--
allows  information  from  different  jurisdictions  to  be  aggregated,  summarized,  and
compared.  Farr,  the  first  medical  statistician  in  the  Office  of  the  Registrar
General  of  England  and  Wales,  recognized  the  importance  of  determining  death  rates
for  different  segments  of  the  population  using  information  collected  systematically  at
the  time  of  birth  or  death.  In  the  first  annual  report  to  the  Registrar  General  in
1839,  Farr  discussed  the  principles  that  should  govern  a  statistical  classification  of
disease  and  urged  the  adoption  of  a  uniform  system  (2,51).  Nomenclature  and
statistical  classification  systems  initially  developed  by  Farr  and  by  Marc  d'Espine 
form  the  basis  of  the  international  disease  classification  system  used  today. 

Information  collected  at  the  time  of  birth  and  death  is  one  of  the  cornerstones  of 
surveillance  in  both  developed  and  developing  countries.  Today,  about  80  countries  or
areas  report  statistics  on  vital  events  to  WHO.  These  statistics  are  coded  and  tabulated
according  to  the  ninth  revision  of  the  International  Classification  of  Diseases
(ICD-9)  and  represent  about  35%  of  the  deaths  that  occur  each  year  worldwide  (52).

Vital  statistics  are  an  important  source  of  information  for  surveillance  because  they 
are  the  only  health-related  data  available  in  many  countries  in  a  standard  format 
(52).  Also,  they  are  often  the  only  source  of  health  information  available  for  the
entire  population  and  the  only  source  available  for  estimating  rates  for  small 
geographic  areas.   Vital  statistics  have  been  used  to: 


monitor  long-term  trends  (53-55);

identify  differences  in  health  status  within  racial  or  other  subgroups  of
the  population  (55,57);

assess  differences  by  geographic  area  (58-62)  or  occupation  (50,63);

monitor  deaths  that  are  generally  considered  preventable  (64-67);

generate  hypotheses  regarding  possible  causes  or  correlates  of  disease
(68,69);

conduct  health-planning  activities  (70,72);  and

monitor  progress  toward  achieving  improved  health  of  the  population
(7,72,73).


The  usefulness  of  vital  statistics  for  surveillance  of  a  particular  health  event 
depends  on  the  characteristics  of  that  health  event,  as  well  as  on  the  procedures  used 
to  collect,  code,  and  summarize  relevant  information.   In  general,  vital  statistics 
will  be  more  useful  for  conditions  that  can  be  ascertained  easily  at  the  time  of  birth 
or  death.   Likewise,  mortality  rates  derived  from  death-certificate  data  will  more 
closely  approximate  true  incidence  for  conditions  with  a  short  clinical  course  that 
are  easy  to  diagnose,  are  easily  identified  as  initiating  a  chain  of  events  leading  to 
death,  and  are  usually  fatal  (52,74-75).  Although  birth  and  death  certificates  are
filed  shortly  after  the  event  occurs,  the  process  of  producing  final  vital  statistics
at  a  national  level  from  these  data  can  take  several  years.   Background  information  on 
the  process  of  producing  vital  statistics,  outlined  here  for  the  United  States,  is 
intended  to  highlight  some  of  the  strengths  and  limitations  of  vital  statistics  for 
public  health  surveillance. 

Birth  and  Death  Certification 

In  the  United  States,  responsibility  for  the  registration  of  birth,  death,  and  fetal 
death  is  vested  in  the  individual  states  and  certain  independent  registration  areas 
(e.g.,  New  York  City)  (77).   States  are  encouraged  to  adopt  standard  certificates 
similar  to  the  "model"  certificate  developed  by  NCHS  in  collaboration  with  other
groups,  although  some  states  modify  the  "model"  certificate  to  comply  with  state  laws
or  regulations  or  to  meet  their  own  information  needs  (78).  Certificates  are  usually
filed  with  a  registrar  within  24  hours  in  the  jurisdiction  in  which  the  event
occurred.  For  birth  certificates,  the  physician  or  attendant  certifies  the  date,
time,  and  place  of  birth,  and  other  hospital  personnel  usually  obtain  information  on
the  remaining  items  (79).  The  1989  model  birth  certificate  includes  additional
information  on  perinatal  risk  factors,  such  as  maternal  illnesses  and  complications  of
labor  and  delivery,  that  will  help  to  improve  surveillance  for  perinatal  events
(77,80,81).

For  death  certificates,  the  funeral  director  is  usually  responsible  for  including  all 
personal  information  about  the  decedent  and  for  assuring  that  medical  information  is 
provided  by  the  physician  who  certifies  the  death  (82).  Information  provided  by  the
physician  includes  the  cause  of  death  (immediate,  "as  a  consequence  of,"  and
underlying  causes),  the  interval  between  onset  of  the  condition  and  death,  other
important  medical  conditions,  the  manner  of  death  (e.g.,  accident,  homicide,  or
suicide),  whether  an  autopsy  was  performed,  and  whether  the  medical  examiner  or
coroner  was  notified  of  the  death  (78).   In  most  cases,  information  from  autopsies  and 
reports  from  medical  examiners  or  coroners  are  not  available  at  the  time  the  death 
certificate  is  filed,  although  the  certificate  can  be  amended  when  this  information 
becomes  available.   Local  registrars  assure  that  all  vital  events  that  occurred  in  the 
jurisdiction  are  registered  and  that  required  information  is  provided  on  certificates 
before  they  are  sent  to  the  state  registrar.  Both  state  and  local  registrars  can  ask
physicians  or  funeral  directors  for  additional  information  if  the  certificate  is
considered  incomplete.  State  registrars  are  usually  responsible  for  numbering, 
indexing,  and  binding  certificates  for  permanent  safekeeping.  Also,  state  registrars 
usually  forward  certificates  for  deaths  of  non-residents  to  their  states  of  residence. 

Coding,  Classification,  and  Calculation  of  Rates 

To  calculate  national  death  rates,  the  number  of  live  births  is  used  as  the  denominator
for  infant  and  maternal  mortality  rates,  and  estimates  of  the  population,  usually
derived  from  censuses,  are  used  as  the  denominators  for  other  death  rates  (51,83).
Conditions  are  classified  and  rates  are  calculated  according  to  the  ninth  revision  of
the  ICD  (ICD-9),  developed  through  WHO  and  in  use  since  1979.  The  ICD-9  includes  a
tabular  list  of  categories  and  conditions  with  code  numbers,  definitions  of  key  terms 
(e.g.,  underlying  cause  of  death,  low  birth  weight),  rules  for  selecting  the 
underlying  cause  of  death,  and  lists  of  conditions  for  statistical  summaries. 
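
The choice of denominator described above can be illustrated with a short sketch. The counts are hypothetical and are not drawn from any registration area; the function name and the multipliers (per 1,000 or per 100,000) are assumptions chosen for illustration.

```python
def rate(events, denominator, per=1_000):
    """Return the number of events per `per` units of the denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return events / denominator * per


# Hypothetical counts for one registration area in one year (not real data).
infant_deaths = 380
maternal_deaths = 4
live_births = 40_000
all_deaths = 26_000
midyear_population = 3_000_000  # e.g., an estimate derived from a census

# Live births are the denominator for infant and maternal mortality.
infant_mortality_rate = rate(infant_deaths, live_births, per=1_000)
maternal_mortality_rate = rate(maternal_deaths, live_births, per=100_000)

# A population estimate is the denominator for other death rates.
crude_death_rate = rate(all_deaths, midyear_population, per=100_000)

print(f"Infant mortality: {infant_mortality_rate:.1f} per 1,000 live births")
print(f"Maternal mortality: {maternal_mortality_rate:.1f} per 100,000 live births")
print(f"Crude death rate: {crude_death_rate:.1f} per 100,000 population")
```

Because the two kinds of rates rest on different denominators, an error in the birth count affects only the first two rates, whereas an error in the population estimate affects only the third.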

Age-standardized  rates  are  usually  calculated  when  summary  rates  are  compared  in  order 
to  control  for  the  effects  of  differences  in  age  structure  between  compared 
populations  (see  Chapter  V).   In  the  United  States,  the  age  distribution  of  the  U.S. 
population  in  1940  is  usually  used  as  the  standard  for  vital  statistics  (84,85).
Other  age  distributions--such  as  the  world  standard  population  and  the  European
standard  population--are  often  used  for  international  comparisons  (see  Chapter  V)  (5).
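
A minimal sketch of direct age standardization follows. The age bands, counts, and standard weights are hypothetical; in particular, the weights are not the 1940 U.S. population or any published standard. The two areas are constructed to have identical age-specific death rates but different age structures, so their crude rates differ while their standardized rates agree.

```python
def crude_rate(deaths, population, per=100_000):
    """Total deaths divided by total population."""
    return sum(deaths) / sum(population) * per


def direct_standardized_rate(deaths, population, standard, per=100_000):
    """Weight each age-specific rate by the standard population's share."""
    if not (len(deaths) == len(population) == len(standard)):
        raise ValueError("all inputs must cover the same age bands")
    total_standard = sum(standard)
    weighted = sum(d / p * s for d, p, s in zip(deaths, population, standard))
    return weighted / total_standard * per


# Three hypothetical age bands: young, middle, old.
standard_weights = [0.50, 0.35, 0.15]      # assumed standard age distribution

area_a_population = [50_000, 40_000, 10_000]   # younger population
area_a_deaths = [10, 50, 400]

area_b_population = [30_000, 40_000, 30_000]   # older population
area_b_deaths = [6, 50, 1_200]                 # same age-specific rates as A

crude_a = crude_rate(area_a_deaths, area_a_population)
crude_b = crude_rate(area_b_deaths, area_b_population)
adjusted_a = direct_standardized_rate(area_a_deaths, area_a_population, standard_weights)
adjusted_b = direct_standardized_rate(area_b_deaths, area_b_population, standard_weights)

print(f"Crude rates:    A = {crude_a:.1f}, B = {crude_b:.1f} per 100,000")
print(f"Adjusted rates: A = {adjusted_a:.2f}, B = {adjusted_b:.2f} per 100,000")
```

The adjusted rate has no meaning in isolation; it is useful only for comparison with other rates adjusted to the same standard.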

In  the  United  States,  about  half  the  states  submit  both  medical  and  demographic  data 
from  certificates  to  NCHS  in  computerized  form  (84,85).  Final  national  mortality  and
natality  data  are  generally  not  available  from  NCHS  for  at  least  20  months  after  the 
close  of  the  calendar  year,  although  a  written  report  based  on  a  10%  sample  of  deaths 
is  available  within  a  few  months.  Final  data  are  often  available  more  quickly  from 
individual  states.  Similarly,  final  mortality  and  natality  data  are  generally 
available,  with  indices  of  quality  and  completeness,  within  2-3  years  for  countries 
that  routinely  report  data  to  WHO  (5).

Comparability  and  Quality  Control 

The  quality  of  vital-statistics  information  depends  on  various  factors,  including  the 
completeness  of  registration;  the  relevance  of  the  categories  used  for  diseases,
injuries,  and  other  conditions;  the  accuracy  of  demographic  and  medical  data  provided
on  certificates;  and  the  translation  of  this  information  into  computerized  data
(including  its  categorization  and  coding).  When  rates  are  calculated,  estimates  are
also  affected  by  the  accuracy  of  the  population  estimates  or  other  estimates  used  for 
denominators.   Differences  in  access  to  medical  care,  diagnostic  practices,  and 
interpretation  of  coding  rules  will  also  affect  comparability. 

Registration  and  medical  certification  of  deaths  are  virtually  complete  in  most
developed  countries  (86).  Population  estimates  used  to  calculate  rates  in  developed
countries  are  usually  derived  from  censuses  conducted  at  regular  intervals  (usually
every  10  years),  in  which  the  total  population  is  enumerated  (6).  Inter-censal
estimates  are  derived  by  adjusting  census  figures  for  birth,  death,  and  migration 
patterns  in  the  intervening  years.   In  some  countries,  population  estimates  are 
derived  from  surveys  or  from  continuous  population  registers.   Through  the  United 
Nations,  population  estimates,  including  indices  of  the  quality  and  completeness  of 
these  estimates,  are  available  for  about  220  countries  or  areas  of  the  world. 
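
The adjustment of census figures for births, deaths, and migration can be sketched as a simple year-by-year balance. This is a simplified illustration with hypothetical figures. Actual inter-censal methods typically also reconcile the series with the following census, and that step is not shown here.

```python
def carry_forward(census_count, yearly_components):
    """Roll a census count forward one year at a time.

    `yearly_components` is a list of (births, deaths, net_migration)
    tuples, one per year after the census. Returns one estimate per year.
    """
    estimates = []
    population = census_count
    for births, deaths, net_migration in yearly_components:
        population = population + births - deaths + net_migration
        estimates.append(population)
    return estimates


# Hypothetical area: census count followed by two years of vital events.
census_count = 1_000_000
components = [
    (15_000, 9_000, 2_000),   # year 1: net in-migration
    (15_500, 9_100, -500),    # year 2: net out-migration
]

estimates = carry_forward(census_count, components)
print(estimates)  # [1008000, 1013900]
```

Errors in any component accumulate over the decade, which is one reason the denominators of rates late in an inter-censal period are less certain than those in a census year.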

Population  under-counts  can  have  a  measurable  impact  on  mortality  rates;  rates  will  be 
inflated,  for  instance,  if  population  estimates  used  for  the  denominator  are  too 
small.   In  the  United  States,  for  instance,  the  1980  age-adjusted  death  rate  (1940  age 
standard)  from  all  causes  would  decrease  by  1.1%  if  the  population  estimate  from  the
1980  census  were  adjusted  for  under-counts  (85).  Effects  are  even  greater  for
subgroups  of  the  population.  For  homicides  and  deaths  resulting  from  legal 
intervention  in  the  United  States  in  1980,  adjustment  for  census  under-count  would 
change  the  ratio  of  death  rates  for  black  to  white  men  ages  35-39  years  from  7.3  to 
6.2--a  decrease  of  nearly  18%. 
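
The arithmetic behind these statements can be sketched as follows. The death count, population, and 5% under-count are hypothetical; only the two ratios 7.3 and 6.2 are taken from the text above. Which base was used for the percentage quoted in the text is not stated, so both bases are computed.

```python
def rate_per_100k(deaths, population):
    return deaths / population * 100_000


def corrected_population(counted, undercount_fraction):
    """Estimate the true population if a fraction of it was missed."""
    if not 0 <= undercount_fraction < 1:
        raise ValueError("undercount_fraction must be in [0, 1)")
    return counted / (1 - undercount_fraction)


def relative_change(old, new):
    return (new - old) / old


# Hypothetical subgroup: 900 deaths, 95,000 persons counted, 5% missed.
deaths = 900
counted = 95_000
unadjusted = rate_per_100k(deaths, counted)
adjusted = rate_per_100k(deaths, corrected_population(counted, 0.05))

# A denominator that is too small inflates the rate; correcting it
# lowers the rate by the same fraction as the under-count.
print(f"Unadjusted: {unadjusted:.1f}, adjusted: {adjusted:.1f} per 100,000")
print(f"Relative change in rate: {relative_change(unadjusted, adjusted):.1%}")

# Ratios quoted in the text for black vs. white men aged 35-39 years.
ratio_before, ratio_after = 7.3, 6.2
change_vs_unadjusted = relative_change(ratio_before, ratio_after)
excess_vs_adjusted = (ratio_before - ratio_after) / ratio_after
print(f"Relative to 7.3: {change_vs_unadjusted:.1%}")
print(f"Relative to 6.2: {excess_vs_adjusted:.1%}")
```

A rate ratio changes whenever the under-count differs between the two groups being compared, because each group's denominator is corrected by a different factor.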

When  cause-specific  rates  are  compared,  both  the  extent  to  which  information  on  birth 
and  death  certificates  is  reported  completely  and  accurately  and  the  precision  of 
population  estimates  will  affect  the  magnitude  and  the  comparability  of  rates.  The 
impact  of  these  factors  is  likely  to  be  of  less  importance  for  aggregated  cause-of- 
death  categories.   Nonetheless,  comparisons  between  different  geographic  areas  or 
different  population  subgroups  should  be  interpreted  cautiously.

Mortality  from  "signs,  symptoms,  and  ill-defined  conditions"  (ICD-9  780-799)  is  often
used  as  an  indicator  of  the  care  and  consideration  given  by  medical  certifiers  to
completing  certificates.  In  recent  years,  the  proportion  of  deaths  for  which  "signs,
symptoms,  and  ill-defined  conditions"  was  coded  as  the  underlying  cause  of  death  ranged
from  less  than  1%  in  Australia,  Czechoslovakia,  Finland,  Hungary,  New  Zealand,  Sweden,
and  the  United  Kingdom  to  5%-10%  in  Belgium,  France,  Greece,  Israel,  Poland,  Portugal,
and  Yugoslavia  (86).  In  the  United  States,  1.4%  of  deaths  in  1988  were  coded  as
"signs,  symptoms,  and  ill-defined  conditions,"  with  a  range  among  the  states  of  0.4%
to  4.1%  (85).
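
As an illustration, this indicator could be computed from a file of underlying-cause codes roughly as follows. The sketch assumes that codes are stored as strings in dotted form (e.g., "799.9"); actual mortality files may store them differently. The sample codes are hypothetical.

```python
def is_ill_defined(icd9_code):
    """True if the code falls in ICD-9 categories 780-799."""
    category = icd9_code.strip().split(".")[0]
    if not category.isdigit():
        # Supplementary codes with a letter prefix (E and V codes)
        # are outside the 780-799 range.
        return False
    return 780 <= int(category) <= 799


def percent_ill_defined(underlying_causes):
    """Percentage of deaths whose underlying cause is coded 780-799."""
    if not underlying_causes:
        raise ValueError("no deaths supplied")
    flagged = sum(1 for code in underlying_causes if is_ill_defined(code))
    return flagged / len(underlying_causes) * 100


# Hypothetical underlying-cause codes for six deaths.
sample = ["410.9", "799.9", "162.9", "780.6", "E812.0", "436"]
print(f"{percent_ill_defined(sample):.1f}% coded to 780-799")
```

Computed separately by state, certifier type, or year, the same percentage can show where follow-up with certifiers is most needed.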

The  impact  of  these  factors  on  international  comparisons  has  been  assessed  for  cancer 
and  for  respiratory  disease  (76,76).   Within  the  United  States,  differences  in 
completeness  and  accuracy  of  certificates  have  also  been  noted  within  racial  and 
ethnic  subgroups  (87).

A  variety  of  approaches  will  facilitate  improvement  in  the  quality  of  information  on 
birth  and  death  certificates.   These  include  providing  physicians  and  funeral 
directors  clearer  instructions  for  completing  the  certificates  and  more  effective 
training  regarding  the  importance  of  vital  statistics  and  the  importance  of  following 
recommended  procedures  for  completing  both  the  medical  and  demographic  sections  of 
certificates  (77,88,89).  State  and  local  registrars  can  increase  the  extent  to  which
they  contact  physicians  and  funeral  directors  when  information  provided  on 
certificates  is  not  considered  complete  and  can  facilitate  amendment  of  certificates 
when  additional  information  is  available  from  autopsies  or  other  sources. 

In  spite  of  limitations,  birth  and  death  certificates  are  an  important  source  of 
information  for  cost-efficient  surveillance  of  a  wide  range  of  health  events  at  local, 
national,  and  international  levels.   Although  differences  in  rates  may  not  always 
reflect  actual  differences  in  disease  and  injury  burden,  routine  analysis  of 
information  obtained  at  the  time  of  birth  and  death  can  highlight  areas  in  which 
further  investigation  of  a  health  event  is  warranted. 

Examples  of  Surveillance  Systems  Based  on  Vital  Statistics  and 
Related  Data

Weekly  reports.  As  part  of  the  national  influenza  surveillance  effort  in  the  United
States,  vital  registrars  in  121  U.S.  cities  report  to  CDC  each  week  the  number  of
deaths  that  have  occurred  in  those  jurisdictions  (90).  This  121-City  Surveillance
System  has  been  operational  since  1952.  The  total  number  of  deaths  and  the  number
attributed  to  pneumonia  and  influenza  by  age  group  are  reported,  and  the  total  numbers
of  deaths  by  age,  city,  and  region  are  published  within  a  week  of  receipt  in  the  MMWR.
About  one-third  of  the  deaths  that  occur  in  the  United  States  are  reported  through  the 
121-City  Surveillance  System,  and  most  are  reported  to  CDC  within  2-3  weeks  of 
occurrence.   Mortality  rates  based  on  the  121-City  system  cannot  be  directly  compared 
with  rates  derived  from  final  mortality  data.  However,  the  121-City  system  does 
detect  short-term  increases  in  deaths  from  influenza  and  pneumonia  in  a  timely  manner 
as  needed  for  public  health  intervention.  Increases  in  mortality  from  other  causes--
including  mortality  during  heat  waves  and  increased  deaths  from  pneumonia  and
influenza  among  young  men  (later  linked  to  AIDS)--have  also  been  detected  using  the
121-City  system. 

Monthly  or  quarterly  reports.   In  the  United  States,  final  mortality  data  are 
generally  not  available  for  nearly  2  years,  although  provisional  estimates  are 
published  by  NCHS  within  3-4  months  in  the  Monthly  Vital  Statistics  Report  (MVSR).  The
Current  Mortality  Sample,  a  10%  systematic  sample  of  certificates,  is  sent  to  NCHS 
each  month  by  state  registrars.   On  the  basis  of  this  sample,  provisional  estimates  of 
total  monthly  mortality  by  age,  race  (white,  black,  other),  gender,  state,  and  region 
are  published  about  3  months  later,  and  provisional  rates  from  72  selected  causes  are 
published  the  following  month.  Provisional  rates  are  published  by  place  of  occurrence 
while  final  rates  are  published  by  place  of  residence.  For  the  Mortality  Surveillance 
System  (MSS),  time-series  regression  models  are  fitted  using  monthly  data,  and  charts 
displaying  monthly  estimates  and  the  fitted  model  for  specific  conditions  are 
published  each  month  in  the  MVSR. 
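
The form of the regression models used for the MSS is not described here. As a sketch only, the following fits one plausible time-series regression, a linear trend plus a 12-month sine and cosine term, by ordinary least squares. The model form, function names, and synthetic monthly rates are all assumptions made for illustration.

```python
import math


def design_row(month_index):
    """Regressors for one month: intercept, trend, annual sine and cosine."""
    angle = 2 * math.pi * month_index / 12
    return [1.0, float(month_index), math.sin(angle), math.cos(angle)]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_seasonal_trend(months, rates):
    """Least-squares fit via the normal equations (X'X) b = X'y."""
    rows = [design_row(t) for t in months]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, rates)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefficients, month_index):
    return sum(c * x for c, x in zip(coefficients, design_row(month_index)))


# Synthetic monthly rates: declining trend with a winter peak, no noise.
months = list(range(48))
rates = [70 - 0.1 * t + 8 * math.cos(2 * math.pi * t / 12) for t in months]

coefficients = fit_seasonal_trend(months, rates)
residuals = [y - predict(coefficients, t) for t, y in zip(months, rates)]
print([round(c, 4) for c in coefficients])
```

In practice, a provisional monthly estimate that departs markedly from the fitted curve is the signal of interest, and the size of the departure has to be judged against the sampling error of the estimate.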

The  Current  Mortality  Sample  and  the  MSS  are  very  useful  for  monitoring  overall  trends 
in  total  mortality  and  for  monitoring  trends  in  relatively  common  causes  of  death  that 
are  increasing  or  decreasing  over  time  (e.g.,  heart  disease,  homicide,  lung  cancer, 
HIV/AIDS).  Although  estimates  are  adjusted  for  under-reporting,  monthly  changes  in
mortality  for  conditions  for  which  supplemental  information  is  often  needed  should  be 
interpreted  with  caution.

Infant  mortality  and  other  adverse  reproductive  outcomes.  Linking  information
from  death  certificates  for  infants  with  information  on  maternal  characteristics  and 
other  information  from  birth  certificates  is  useful  for  assessing  potentially 
preventable  mortality  by  geographic  area  and  within  subgroups  of  the  population.   In 
England  and  Wales,  birth  and  death  records  for  infants  were  linked  for  infants  born  in 
1949-1950  and  again  for  infants  who  died  from  April  1954  to  March  1965  (2).  All
births  and  deaths  of  infants  have  been  linked  routinely  in  England  and  Wales  since 
1975.   In  the  United  States,  birth  and  death  certificates  have  been  linked  for  infants 
born  from  1983  to  1986  (PI).   Approximately  40,000  infants  die  each  year  in  the  United 
States,  and  at  least  98%  of  the  death  certificates  for  infants  have  been  linked  to 
birth  certificates  in  these  years.  This  information  is  also  useful  for  health 
planning  and  for  targeting  services,  since  U.S.  infant  mortality  rates  vary 
considerably  by  geographic  area  and  within  demographic  subgroups. 
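
A minimal sketch of such a linkage follows. It assumes that each infant death record carries the number of the corresponding birth certificate; the field names and records are hypothetical. When no shared identifier exists, linkage must instead rely on names, dates, and other items, which is not shown here.

```python
from collections import Counter


def link_infant_deaths(birth_records, death_records):
    """Match each death record to a birth record by certificate number."""
    births_by_id = {b["birth_cert_no"]: b for b in birth_records}
    linked, unlinked = [], []
    for death in death_records:
        birth = births_by_id.get(death["birth_cert_no"])
        if birth is None:
            unlinked.append(death)
        else:
            linked.append({**birth, **death})  # birth items plus death items
    return linked, unlinked


def deaths_per_1000_births(birth_records, linked_records, key):
    """Infant deaths per 1,000 live births within each level of `key`."""
    births = Counter(b[key] for b in birth_records)
    deaths = Counter(d[key] for d in linked_records)
    return {group: deaths[group] / n * 1000 for group, n in births.items()}


# Hypothetical records; the second death has no matching birth certificate.
births = [
    {"birth_cert_no": "B001", "birth_weight_group": "low"},
    {"birth_cert_no": "B002", "birth_weight_group": "low"},
    {"birth_cert_no": "B003", "birth_weight_group": "normal"},
    {"birth_cert_no": "B004", "birth_weight_group": "normal"},
    {"birth_cert_no": "B005", "birth_weight_group": "normal"},
]
deaths = [
    {"birth_cert_no": "B002", "age_at_death_days": 3},
    {"birth_cert_no": "B999", "age_at_death_days": 40},
]

linked, unlinked = link_infant_deaths(births, deaths)
percent_linked = len(linked) / len(deaths) * 100
rates = deaths_per_1000_births(births, linked, "birth_weight_group")
print(f"{percent_linked:.0f}% of deaths linked; rates by group: {rates}")
```

The percentage linked is itself a quality indicator, since deaths that cannot be linked drop out of every rate computed by characteristics recorded only on the birth certificate.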

Information  on  birth  certificates  has  also  been  used  to  identify  high-risk  mothers  who 
need  supportive  services  for  infant  care.   In  Michigan,  for  instance,  information  on 
birth  certificates  is  transmitted  electronically  from  hospitals  to  the  state  health 
department  (91).  Key  information  is  then  sent  to  county  health  departments  so  that
public  health  nurses  can  be  assigned  to  areas  with  the  greatest  need. 

Occupational  mortality.   William  Farr  was  the  first  to  evaluate  systematically  the 
associations  between  occupation  and  cause  of  death  (50).  The  Decennial  Supplement  on
Occupational  Mortality  for  England  and  Wales  has  been  published  approximately  every  10
years  since  1855  (1,2).  Cause-specific  rates  and  ratios  by  occupation,  adjusted  for
social  class,  are  estimated  using  information  derived  from  death  certificates  and  from
the  decennial  census  (63).  Although  estimates  are  affected  by  sources  of  error  in
both  data  sets,  occupation-specific  mortality  rates  are  useful  for  identifying
occupations  for  which  more  detailed  studies  may  be  warranted  (92).
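
One common way to form such a ratio is indirect standardization, which yields a standardized mortality ratio (SMR). Whether the publications cited above use exactly this calculation is not stated here, so the sketch below is an assumption. It uses hypothetical figures and stratifies by age; strata could instead be defined by social class or by both.

```python
def expected_deaths(reference_rates, person_years):
    """Deaths expected if the reference rates applied to this occupation."""
    return sum(reference_rates[stratum] * person_years[stratum]
               for stratum in person_years)


def standardized_mortality_ratio(observed, reference_rates, person_years):
    """Observed deaths as a percentage of expected deaths."""
    expected = expected_deaths(reference_rates, person_years)
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    return observed / expected * 100


# Hypothetical reference death rates (per person-year) for all occupations,
# as would be derived from death certificates and census counts.
reference_rates = {"25-44": 0.001, "45-64": 0.008}

# Hypothetical person-years at risk in one occupation, from the census.
person_years = {"25-44": 20_000, "45-64": 10_000}

observed_deaths = 130  # from death certificates listing this occupation

smr = standardized_mortality_ratio(observed_deaths, reference_rates, person_years)
print(f"SMR = {smr:.0f}")
```

Because the numerator comes from death certificates and the denominator from the census, any difference in how occupation is recorded in the two sources biases the ratio directly.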

In  the  United  States,  usual  occupation  (even  if  retired)  and  industry  are  included  on 
the  standard  death  certificate  (85).  The  states  are  not  required  to  report  this
information  to  NCHS,  but  if  it  is  submitted,  it  has  been  included  since  1985  in  the
computerized  final  mortality  files  using  the  Standard  Occupational  Classification  and
Standard  Industrial  Classification  systems.  In  1987,  14  states  reported  information  on
occupation  and  industry  to  NCHS.  In  1989,  occupation  and  industry  during  the  last
year  for  both  mother  and  father  were  added  to  the  standard  certificate  for  deaths  of
fetuses  (77).  Through  the  National  Traumatic  Occupational  Fatalities  (NTOF)
surveillance  system,  CDC's  National  Institute  for  Occupational  Safety  and  Health
(NIOSH)  obtains  additional  information  for  work-related  traumatic  deaths  that  is
included  on  death  certificates  but  that  is  not  coded  and  computerized  routinely  in  all
states  (93).  State-  and  industry-specific  rates  are  derived  using  estimates  of  the
employed  population  from  the  Bureau  of  Labor  Statistics.  Analyses  from  the  NTOF 
suggest  that  traumatic  occupational  fatality  rates  decreased  in  the  United  States 
between  1980  and  1985,  although,  in  some  instances,  large  differences  were  found  in 
fatality  rates  by  gender  and  by  state  within  the  same  industry. 

Supplemental  information  from  other  sources.   Other  sources  of  information  may  be 
available  on  the  circumstances  leading  to  death.  In  the  United  States,  medical 
examiners  and  coroners  are  responsible  for  investigating  sudden  and  unexpected  deaths--
homicides,  suicides,  deaths  from  unintentional  injuries,  and  unanticipated  deaths
from  natural  causes--which  account  for  about  20%  of  all  deaths  each  year.   Reports 
from  medical  examiners  and  coroners  include  detailed  information  on  the  circumstances 
surrounding  death,  results  of  laboratory  analyses  for  alcohol  and  drugs,  and  other 
relevant  information.  These  reports  have  been  used,  for  instance,  to  investigate 
deaths  associated  with  horseback  riding,  drug  abuse,  hurricanes,  earthquakes,  and  heat 
waves  (94-98).  In  1990,  through  the  Medical  Examiner/Coroner  Information  Sharing
Program,  data  from  investigations  of  death  were  reported  to  CDC's  National  Center  for 
Environmental  Health  (NCEH)  in  a  computerized  format  from  nine  state  and  eight  county 
medical-examiners'  offices  (R.G.  Parrish,  personal  communication). 

Additional  information  on  fatalities  is  often  available  from  other  sources.   In  the 
United  States,  for  instance,  the  Fatal  Accident  Reporting  System  (FARS)  from  the  NHTSA
has  been  used  to  investigate  the  association  between  use  of  child  restraints  and
motor-vehicle-related  crashes  (99)  and  the  association  between  premature  mortality  and
alcohol-related  traffic  crashes  (100).  The  relationship  between  homicide  and  the
prevalence  of  hand-gun  ownership  in  the  United  States  and  Canada  has  been  investigated 
using  data  from  uniform  crime-reporting  registries  of  all  homicides  and  aggravated 
assaults  maintained  by  the  Federal  Bureau  of  Investigation  in  the  United  States  and 
the  Centre  for  Justice  Statistics  in  Canada  (102).  Other  sources--such  as  police,
ambulance,  and  fire  reports--may  also  include  information  that  is  useful  for
surveillance  of  particular  health  events. 

SENTINEL  SURVEILLANCE

Overview 

The  term  "sentinel  surveillance"  encompasses  a  wide  range  of  activities  focused  on  the 
monitoring  of  key  health  indicators  in  the  general  population  or  in  special 
populations.  Characteristics  of  these  activities  vary  considerably,  but,  in  general, 
their  primary  intent  is  to  obtain  timely  information  needed  for  public  health  or 
medical  action  in  a  relatively  inexpensive  manner  rather  than  to  derive  precise 
estimates  of  prevalence  or  incidence  in  the  general  population.  The  term  "sentinel" 
has  been  applied  to  key  health  events  that  may  serve  as  an  early  warning  or  represent 
the  tip  of  the  iceberg;  to  clinics  or  other  sites  where  health  events  are  monitored; 
or  to  networks  of  health-care  providers  who  agree  to  report  information  on  one  or  more 
health  events.   A  sentinel  health  event,  according  to  Rutstein,  is  a  "preventable 
disease,  disability,  or  untimely  death  whose  occurrence  serves  as  a  warning  signal 
that  the  quality  of  preventative  and/or  therapeutic  medical  care  may  need  to  be 
improved"  (102).  Sentinel  surveillance,  according  to  Woodhall,  represents  "an  attempt
to  find  a  system  that  would  provide  a  measure  of  disease  incidence  in  a  country  in  the
absence  of  good  nation-wide  institution-based  surveillance  without  having  to  resort  to
large  expensive  surveys"  (103).  Sentinel  surveillance  systems  are  not  limited  to
developing  countries.   In  Europe,  routine  morbidity  surveillance  is  often  conducted  by 
networks  of  primary  care  providers  who  routinely  report  information  on  conditions  that 
are  relatively  common  in  general  practice  (104,105). 

Sentinel  Health  Events 

Sentinel  health  events  are  monitored  for  many  different  public  health  programs.   In 
the  United  States,  sentinel  surveillance  for  maternal  mortality,  first  used  in  New 
York  City  in  the  1930s,  was  associated  with  a  rapid  decline  in  mortality  associated
with  childbirth.   For  each  case,  medical  panels  reviewed  pertinent  records  to  identify 
missed  opportunities  that  might  have  prevented  a  presumably  unnecessary  death. 
Similar  methods  have  been  used  to  monitor  deaths  of  infants.   In  Massachusetts,  review 
of  records  indicated  that,  in  1967-1968,  about  one-third  of  the  deaths  of  infants
could  have  been  prevented  by  medical  intervention  (102).  Monitoring  preventable
conditions  can  also  highlight  more  general  problems.  For  instance,  a  review  of  deaths
among  infants  from  Rh  hemolytic  disease,  about  90%  of  which  are  considered
preventable,  indicated  that  mothers  of  many  affected  infants  did  not  have  medical
insurance  coverage  (106).  Quality  of  care  has  also  been  evaluated  using  conditions
for  which  death  or  disability  could  have  been  prevented,  including  evaluation  of
hospital-based  mortality  rates  after  adjustment  for  certain  patient  characteristics
(107-109).

Sentinel  surveillance  activities  have  been  particularly  useful  for  identifying  health 
events  that  may  be  related  to  occupational  exposures.   Lists  of  occupation-related 
health  events  have  been  developed,  some  of  which  (e.g.,  mesothelioma  and  angiosarcoma 
of  the  liver)  are  specifically  tied  to  environmental  or  occupational  exposure,  and  some
of  which  (e.g.,  lung  cancer  and  bladder  cancer)  have  other  risk  factors  as  well  (102). 
Mesothelioma,  for  instance,  is  a  rare  form  of  cancer  specifically  associated  with 
exposure  to  asbestos  that  may  identify  the  "tip  of  the  iceberg"  of  asbestos-related 
disease  in  an  industry  in  which  workers  develop  more  common  conditions,  such  as  lung 
cancer  and  chronic  obstructive  pulmonary  disease. 

In  the  United  States,  NIOSH  has  developed  the  Sentinel  Event  Notification  System  for 
Occupational  Risks  (SENSOR)  program,  which  focuses  on  surveillance  of  specific 
occupational  conditions  by  networks  of  sentinel  providers  (210).  Target  conditions
monitored  by  at  least  one  of  the  10  states  initially  included  in  the  program  include
silicosis,  occupational  asthma,  pesticide  poisoning,  lead  poisoning,  and  carpal-tunnel
syndrome.  When  cases  identified  by  sentinel  providers  (usually  physicians  who
practice  occupational  medicine)  are  found  to  be  occupation-related,  intervention 
activities  are  undertaken  by  state  health  departments  in  order  to  prevent  additional 
cases.   Although  primarily  used  for  case  identification  and  follow-up,  information 
derived  from  SENSOR  projects  may  augment  other  sources  of  information  on  trends  for 
occupation-related  disorders. 

Health  indicators  that  are  monitored  in  many  different  countries  could  also  be 
considered  sentinel  health  events.   Infant-mortality  rates,  for  instance,  are  used  in 
both  developing  and  developed  countries  as  an  indicator  of  the  availability  and  the 
quality  of  medical  care.  In  Europe  and  the  United  States,  additional  health
indicators  are  monitored  routinely  to  assess  the  general  health  of  the  population.  In
Europe,  22  key  health  indicators  have  been  monitored  routinely  since  1986  through
WHO's  Health  for  All  activity  in  order  to  compare  progress  toward  reducing  preventable
morbidity  and  mortality  in  participating  countries  (7).   In  the  United  States, 
specific  goals  and  objectives  for  improving  the  nation's  health  are  monitored  using 
key  health  indicators.   Goals  and  objectives  initially  developed  for  1990  have  been 
revised  and  expanded  for  the  Year  2000  so  that  progress  toward  attainment  of  specific 
objectives  can  be  monitored  quantitatively  (73).  A  total  of  226  goals  and  objectives
for  the  Year  2000  has  been  proposed  for  use  in  monitoring  health  status  at  the
national  level,  and  a  subset  of  18  indicators  has  been  selected  for  monitoring  by  all
levels  of  government  (112).  Most  of  these  18  community-health-status  indicators  are
based  on  vital  statistics  and  data  from  the  NNDSS. 

Sentinel  Sites 

Sentinel  hospitals,  clinics,  and  counties  can  often  provide  timely  information  on  a
wide  range  of  health  conditions  that  is  not  available  from  other  sources.   Although 
information  is  generally  not  available  for  the  entire  population,  sentinel  systems  in 
both  developing  and  developed  countries  can  provide  sufficient  information  for  making 
public  health  decisions  and  for  detecting  long-term  trends.   In  developing  countries, 
the  WHO  Expanded  Programme  on  Immunization  uses  sentinel  hospitals  and  clinics  in  25
target  cities  to  monitor  the  impact  of  vaccination  on  the  incidence  of  neonatal 
tetanus,  poliomyelitis,  diphtheria,  measles,  pertussis,  and  tuberculosis  (203).   After 
initial  contact  with  many  hospitals  and  clinics,  officials  choose  sentinel  sites  that 
serve  populations  as  similar  as  possible  to  the  general  population.   In  developed 
countries,  sentinel  providers,  hospitals,  and  clinics  are  used  to  monitor  conditions 
for  which  information  is  not  otherwise  available.  Sentinel  primary-care  providers 
report  information  on  conditions  seen  in  ambulatory  settings,  while  sentinel  sites-- 
such  as  drug,  sexually  transmitted  disease,  and  maternal  and  child  health  clinics-- 
monitor  conditions  in  subgroups  that  may  be  more  vulnerable  than  the  general 
population. 

Sentinel  hospitals,  clinics,  and  counties  can  also  provide  public  health  information 
that  is  not  readily  available  from  other  sources.   In  the  United  States,  for  instance, 
viral  hepatitis  is  a  notifiable  disease,  but  non-A  non-B  hepatitis  (most  of  which  is 
hepatitis C) is under-reported, and not all of the detailed information on serology,
demographics,  and  routes  of  transmission  needed  for  monitoring  is  routinely  available. 
To  obtain  such  information,  patients  with  hepatitis  reported  to  four  county  health 
departments  are  interviewed,  are  tested  serologically  at  regular  intervals  after  the 
onset  of  illness,  and  are  followed  prospectively  to  determine  whether  they  have 
developed hepatitis B- or hepatitis C-related chronic liver disease (112,123). Taken
together,  these  sentinel  counties  are  intended  to  be  representative  of  the  incidence 
and  epidemiologic  characteristics  of  hepatitis  B  in  the  United  States.   Findings  from 
these  sentinel  counties  have  highlighted  the  increasing  importance  of  parenteral  drug 
use  in  the  transmission  of  both  hepatitis  B  and  C. 

Sentinel sites are also used in the United States for surveillance of HIV infection
(114). Because the HIV epidemic comprises multiple sub-epidemics in different
population groups and geographic areas, its progression can be monitored by
directing surveillance efforts at groups who are at increased risk of HIV
infection. The use of standardized survey methods and
serologic  testing  procedures  facilitates  comparison  of  findings  from  the  different 
groups.   Included  in  the  HIV  family  of  surveys  are  studies  of  groups  that  receive  care 
through publicly funded clinics--including those for tuberculosis, drug treatment,
sexually  transmitted  disease,  family  planning,  and  prenatal  care.  Other  sentinel 
groups  in  which  HIV  prevalence  is  monitored  include  hospital  patients  with  diagnoses 
that  are  not  likely  to  be  associated  with  HIV  infection,  women  at  the  time  of 
childbirth, blood donors, military recruits, Job Corps applicants, university
students,  prisoners,  migrant  farm  workers,  and  homeless  persons.  Findings  from  HIV 
sentinel  surveillance  systems  have  been  used  to  monitor  progression  of  the  epidemic  in 
vulnerable  populations  and  to  estimate  prevalence  in  the  community  at  large. 

Sentinel  Providers 

Networks  of  sentinel  general  or  family  practitioners  and  other  primary  care  providers 
are  active  in  many  European  countries  and  in  the  United  States,  Canada,  Israel, 
Australia, New Zealand, and other countries (115-117). Providers in some of these
networks conduct independent research projects, but many of them--particularly in
Europe and Australia--report surveillance data that are used by national health
agencies. Primary-care practitioners can provide timely information for surveillance
because they generally provide the first professional judgment for medical problems
that  are  seen  in  early  stages.   In  most  networks,  primary-care  physicians  report  a 
minimum  amount  of  information,  usually  at  weekly  intervals,  on  a  select  group  of 
health events that are relatively common in general practice. These networks report a
wide range of health events, including infectious diseases, whether or not they are
notifiable in the country concerned; conditions such as dementia, gastric
ulcers, multiple sclerosis, acute pesticide poisoning, and drug abuse; and requests
for services, such as mammography, cervical smears, and testing for HIV (104).
Although  most  systems  are  based  on  reports  by  primary-care  practitioners,  the  extent 
to  which  rates  can  be  calculated  that  reflect  morbidity  in  the  general  population  is 
related  in  large  part  to  the  manner  in  which  medicine  is  organized  and  practiced  in 
that  country.  For  instance,  morbidity  reporting  by  sentinel  general  practitioners 
would  more  closely  approximate  morbidity  in  the  general  population  in  countries  with 
universal  health-care  coverage  in  which  patients  are  assigned  to  the  same  provider  or 
group  of  providers,  in  which  specialists  are  seen  only  by  referral,  and  in  which 
sentinel  providers  are  selected  that  serve  populations  that  are  demographically 
similar to the general population. None of the existing networks meets all of these
criteria, and the most enduring networks are usually characterized by highly motivated
volunteer providers who report information consistently over time. When the
population from which patients are drawn cannot be characterized, the number of cases
relative to the total number of patients seen or to the number of reporting physicians
is usually monitored. Regardless of the strengths and limitations of each network, most
are able to provide preliminary descriptive information in a timely manner for health
events seen in ambulatory-care settings for which information is not otherwise
available.
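
As a rough illustration of the two denominators described above, the sketch below computes cases per 100 patients seen and the average number of cases per reporting physician for a single reporting week. The structure, function names, and counts are hypothetical and are not drawn from any of the networks described here.

```python
from dataclasses import dataclass


@dataclass
class WeeklyReport:
    """One sentinel provider's report for one week (hypothetical structure)."""
    cases: int           # cases of the condition under surveillance
    patients_seen: int   # total patients seen by this provider that week


def cases_per_100_patients(reports):
    """Cases relative to the total number of patients seen, per 100."""
    total_cases = sum(r.cases for r in reports)
    total_seen = sum(r.patients_seen for r in reports)
    if total_seen == 0:
        raise ValueError("no patients seen; proportion is undefined")
    return 100.0 * total_cases / total_seen


def cases_per_physician(reports):
    """Average number of cases per reporting physician."""
    if not reports:
        raise ValueError("no reports received")
    return sum(r.cases for r in reports) / len(reports)


# Illustrative numbers only.
week = [WeeklyReport(3, 120), WeeklyReport(0, 80), WeeklyReport(5, 200)]
print(cases_per_100_patients(week))
print(cases_per_physician(week))
```

Neither measure is a population rate; each only allows trends to be followed within the same group of reporting providers.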

A recent survey by Eurosentinel, a newly formed consortium funded by the European
Economic Community to coordinate activities of sentinel general-practitioner networks,
found that, as of March 1990, there were at least 39 active networks in Europe (104).
Among  the  more  established  networks  are  those  in  Great  Britain,  the  Netherlands, 
Belgium, and France. Ten of these networks participated in joint data-collection efforts,
including weekly reporting of mumps, measles, and influenza-like illness, and studies
of the use of selected laboratory tests in general practice and of requests for HIV
testing (105,118).

The oldest sentinel-provider network in Europe, the Weekly Returns Service, was
organized  by  the  Royal  College  of  General  Practitioners  in  Great  Britain  and  has  been 
in continuous operation since 1962 (104). In 1990, 242 volunteer general
practitioners  from  66  practices  in  Great  Britain  reported  weekly  incidence  data  for  44 
conditions  selected  collaboratively  by  participating  practitioners,  epidemiologists, 
and health-service providers (119). These sentinel providers report conditions for
about  1%  of  the  population,  and  rates  per  100,000  population  can  be  calculated  using 
information  from  patient  lists.   Reported  conditions  range  from  those  with  official 
notification  procedures  in  Great  Britain  (e.g.,  measles  and  whooping  cough)  to 
conditions  (e.g.,  multiple  sclerosis,  rheumatoid  arthritis,  thyrotoxicosis,  and 
attempted  suicide)  for  which  less  information  is  routinely  available  from  outpatient 
settings (104,119,120). Information from the Weekly Returns Service has been
particularly  useful  for  monitoring  trends  in  influenza  and  related  illnesses  in  Great 
Britain. 

The  Surrey  University  Morbidity  Network,  also  covering  about  1%  of  the  population  of 
Great Britain, has been operational since 1974 (104). In 1990, 42 infectious and
non-infectious conditions were monitored by 120 practices. One of the purposes of this
network  is  to  examine  seasonal  and  other  environmental  influences  on  morbidity.   Data 
have  been  collected  and  transmitted  electronically  since  1985,  and  participating 
physicians  receive  reports  regularly. 

A  network  of  sentinel  general  practitioners  has  reported  to  the  Netherlands  Institute 
of Primary Health Care (NIVEL) since 1970 (104,121,122). The primary purpose of this
network,  which  covers  about  1%  of  the  population,  is  to  gather  reliable  epidemiologic 
data  on  health  problems,  as  well  as  on  actions  taken  by  providers  to  address  these 
problems.   In  1990,  45  practices  involving  63  general  practitioners  participated  in 
the  network.   Information  on  16  topics  was  reported  weekly  in  1988-1989,  including 
requests  for  sterilization,  referrals  for  speech  therapy  and  echocardiography,  and 
newly diagnosed cases of dementia. Reasonable estimates of morbidity are possible
because access to medical specialists is available only by referral, because a
relatively well-defined population is served by each practice, and because
practitioners, although volunteers, are chosen so that the distribution of their
patients is as representative of the Dutch population as possible (121). Many
descriptive studies have been published using information provided by the Dutch
network (121-123).

The Belgian Sentinel Practice Network has been operated by the National Health
Department  since  1979  (124-126).     Each  year,  about  1,500  general  practitioners  are 
contacted,  about  10%  of  them  usually  agree  to  participate,  and  a  final  group  is 
selected  so  that  their  patients  are  representative  of  the  age  and  sex  distribution  of 
the  general  population.  An  estimated  1.3%  of  the  population  in  Belgium  were  seen  by 
sentinel practitioners (104). In 1990, measles, acute respiratory infections, new
cases  of  cancer,  suicide  attempts,  and  requests  for  HIV  tests  were  reported  by  the 
network,  in  addition  to  five  officially  notifiable  diseases  (gonorrhea,  infectious 
hepatitis,  meningitis,  syphilis,  and  urethritis).   Dissemination  of  the  information  is 
one  of  the  strengths  of  the  Belgian  network.   Bimonthly  and  annual  reports  are  sent  to 
participating  practitioners,  to  the  Ministry  of  Public  Health,  to  medical  and  public 
health  schools,  to  professional  organizations,  and  to  the  press. 

In  France,  networks  of  sentinel  primary-care  providers  transmit  and  receive 
information  on  selected  conditions  using  computer  terminals  and  modems  available 
nationally at low cost (127). Interactive electronic systems are used by the national
French Communicable Diseases Computer Network (FDCN), as well as by local and regional
networks in the cities of Toulouse and St. Etienne, and in the regions of Aquitaine,
France-Sud, and Lyon (104). The largest network, the FDCN, has been operated by the
National Health Department and the National Institute of Health since 1984. In 1990,
about 550 volunteer sentinel general practitioners, about 1% of the number throughout
France, reported new cases of influenza, viral hepatitis, urethritis, measles, and
mumps each week, none of which were officially notifiable (104,128). Since the
underlying  population  seen  by  reporting  physicians  is  not  known,  trends  are  usually 
expressed  as  the  average  number  of  cases  per  reporting  physician  per  week. 
Information  is  also  transmitted  directly  by  national,  hospital,  and  other 
laboratories;  and  local,  regional,  and  national  health  agencies  are  also  included  in 
the network (127). Electronic mail and bulletin boards are used to disseminate
information,  and  reporting  physicians  can  contact  researchers  and  obtain  literature 
searches  through  the  network. 

Tracking  the  spread  of  influenza-like  illness  using  the  FDCN  has  been  particularly 
effective.  Epidemic  thresholds  can  be  calculated  on  the  basis  of  data  from  previous 
years, and the extent of regional spread can be tracked each week (128,129). Unlike
mortality-based surveillance systems, the FDCN was able to show that the 1988-1989
influenza epidemic occurred earlier, was of shorter duration, and affected primarily
young age groups relative to epidemics in previous years (130). In addition to
routine surveillance activities, the FDCN has been used to conduct surveys on
physician attitudes regarding vaccination for measles; the use of measles, mumps, and
rubella trivalent vaccine; HIV testing; and biologic testing for diarrhea (104).
Surveys conducted before and after a nationwide AIDS campaign that emphasized risks
associated with heterosexual activity found that the number of tests given to women
and to heterosexual men increased following the campaign (131). Studies of diarrheal
disease have been conducted by the Aquitaine network (132). Findings from the
Aquitaine studies, coupled with findings on measles from the FDCN, highlight that
localized  outbreaks  of  disease  for  which  public  health  action  is  warranted  can  be 
missed  by  sentinel  networks  that  typically  monitor  conditions  in  about  1%  of  the 
population. 
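
One simple way to derive an epidemic threshold from earlier seasons is to take the mean of the values reported for the same calendar week in previous years and add a multiple of their standard deviation. The sketch below does this. It is an illustrative assumption and not necessarily the method used by the FDCN; the multiplier of 1.645 and all values are hypothetical.

```python
from statistics import mean, stdev


def epidemic_threshold(previous_years, z=1.645):
    """Upper threshold for one calendar week from prior years' values.

    previous_years: values for the same week in earlier years
    (for example, cases per reporting physician).
    z: number of standard deviations above the mean (an assumption).
    """
    if len(previous_years) < 2:
        raise ValueError("need at least two previous years")
    return mean(previous_years) + z * stdev(previous_years)


def exceeds_threshold(current, previous_years, z=1.645):
    """True when the current week's value is above the threshold."""
    return current > epidemic_threshold(previous_years, z)


# Illustrative values only: the same week in five earlier years.
history = [2.0, 3.0, 2.5, 3.5, 4.0]
print(round(epidemic_threshold(history), 2))
print(exceeds_threshold(6.0, history))
print(exceeds_threshold(3.0, history))
```

A threshold built this way treats earlier epidemic weeks as part of the baseline, so in practice such weeks are often excluded or a seasonal model is fitted instead.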

In  the  United  States,  a  network  of  139  sentinel  physicians  reports  cases  of  influenza- 
like illness  each  week  to  CDC  (47,133).      Nasopharyngeal  specimens  are  sent  by  70 
physicians  to  a  central  laboratory,  which  then  reports  findings  to  reporting 
physicians  and  to  CDC.   Physicians  also  report  the  total  number  of  office  visits  per 
week  so  that  the  percentage  of  visits  by  patients  with  influenza-like  illnesses  can  be 
estimated.   In  1991,  sentinel  physicians  from  the  Middle  Atlantic  and  West  South 
Central  regions  of  the  United  States  reported  increased  visits  for  influenza-like 
illness  by  late  November,  although  numbers  of  such  visits  had  not  yet  increased  in 
other  areas  of  the  country. 

Networks  of  family  practitioners  and  other  primary-care  providers  have  been  formed  in 
the  United  States  and  Canada,  primarily  to  conduct  collaborative  research  projects, 
but  have  the  potential  to  conduct  surveillance.  The  descriptive  and  analytic  studies 
performed  by  these  networks  have  been  very  useful  for  identifying  patterns  of  illness 
in  outpatient  settings.   Unlike  most  networks  in  Europe,  however,  they  have  generally 
not  had  formal  reporting  relationships  with  state  or  local  health  agencies  that  are 
responsible  for  timely  public  health  activities.  The  Ambulatory  Sentinel  Practice 
Network (ASPN), formed in 1981, includes 334 volunteer clinicians from 71 practices in
the United States and Canada, most of whom are family practitioners and many of whom
practice in rural areas (115,134). Many studies conducted by ASPN--including studies
of pelvic inflammatory disease, spontaneous abortion, chest pain, carpal tunnel
syndrome, and HIV prevalence--have increased knowledge regarding the distribution of
conditions with public health impact among patients seen in private ambulatory-care
settings (135-138).

The  Pediatric  Research  in  Office  Settings  (PROS)  network,  formed  in  1985  and  sponsored 
by the American Academy of Pediatrics, currently includes about 740 practitioners in
224 practices (139). The PROS network has completed a study of vision screening of
young  children  and  a  pilot  study  of  febrile  illness  among  infants.  Regional  primary- 
care  networks  include  the  Dartmouth  COOP  project  in  northern  New  Hampshire  and 
Vermont,  the  Upper  Peninsula  Research  Network  in  Michigan,  and  the  Wisconsin  Research 
Network.  Studies  with  public  health  impact  conducted  by  regional  networks  include 
studies  of  cholesterol-,  alcohol-,  and  cancer-screening  activities;  development  of 
methods  to  identify  functional  deficits;  and  development  of  health-maintenance 
protocols  for  use  in  private  practice. 

Many  of  the  established  networks  of  primary-care  providers  participate  in 
international  collaborative  organizations,  such  as  the  International  Primary  Care 
Network (IPCN), the European Electronic Adverse Drug Reaction Network (EEADRN), and
Eurosentinel (104,140). A recent IPCN study of 3,360 children from nine countries
showed  that  the  proportion  of  children  with  otitis  treated  with  antibiotics  varied 
widely  between  countries  and  that  antibiotic  treatment  did  not  improve  the  rate  of 
recovery (117). In association with the British pharmaceutical industry, the EEADRN
monitors adverse drug reactions in the United Kingdom, Ireland, the Netherlands,
Belgium,  and  Switzerland  (204).  Approximately  2,350  physicians  participate  in  the 
network  using  hand-held  computers  to  transfer  information  to  the  coordinator. 

Establishment  of  a  computerized  European  sentinel-practice  network  is  a  long-term  goal 
of Eurosentinel, although preliminary findings indicate that the existing networks
are  quite  heterogeneous.  Nonetheless,  Eurosentinel  can  serve  as  a  clearinghouse  for  a 
wide  range  of  activities  that  highlight  similarities  and  differences  between 
countries--both in patterns of disease and in the practice of medicine and public
health.   Eurosentinel  could  also  serve  as  a  model  for  a  broad-based  international 
consortium of sentinel practice networks.

REGISTRIES

Overview 

The  use  of  registries  for  surveillance  and  other  medical  or  public  health  activities 
has  increased  in  recent  years,  largely  because  information  from  other  sources, 
including  notifiable  disease  reporting  mechanisms  and  vital  statistics,  is  often  not 
adequate for monitoring the public health impact of non-acute diseases (142).
Registries  differ  from  other  sources  of  surveillance  data  in  that  information  from 
multiple  sources  is  linked  for  each  individual  over  time.  Information  is  collected 
systematically  from  diverse  sources,  including  hospital-discharge  abstracts,  treatment 
records,  pathology  reports,  and  death  certificates.  Information  from  these  sources  is 
then  consolidated  for  each  individual  so  that  each  new  case  is  identified  and  cases 
are  not  counted  more  than  once.  Case  series  and  hospital-based  registries  in  which  the 
population  at  risk  is  not  known  can  be  useful  for  a  variety  of  activities,  including 
descriptive  analyses  and  assessment  of  treatment  effectiveness.   However,  population- 
based  registries  from  which  incidence  rates  can  be  calculated  are  generally  more 
useful.  Information  from  registries  is  used  primarily  for  research  purposes,  but  in 
many  instances,  registries  have  been  useful  for  surveillance  and  related  activities. 
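
The consolidation step described above can be sketched as grouping records from different sources by person and condition, so that each case is counted once. The sketch assumes that every source record already carries a reliable shared identifier (a hypothetical `person_id`); in practice registries often have to link records on names, birth dates, and other fields instead. All field names and records are illustrative.

```python
from collections import defaultdict


def consolidate(records):
    """Merge source records into one entry per (person, condition).

    Each record is a dict with hypothetical keys: person_id, condition,
    source, and date (ISO string). The earliest date is kept as the
    date of first report, and all contributing sources are retained.
    """
    cases = {}
    sources = defaultdict(set)
    for rec in records:
        key = (rec["person_id"], rec["condition"])
        sources[key].add(rec["source"])
        if key not in cases or rec["date"] < cases[key]["first_date"]:
            cases[key] = {
                "person_id": rec["person_id"],
                "condition": rec["condition"],
                "first_date": rec["date"],
            }
    for key, case in cases.items():
        case["sources"] = sorted(sources[key])
    return list(cases.values())


# Illustrative records only.
records = [
    {"person_id": "A1", "condition": "lung cancer",
     "source": "pathology report", "date": "1990-03-02"},
    {"person_id": "A1", "condition": "lung cancer",
     "source": "hospital discharge", "date": "1990-03-10"},
    {"person_id": "B2", "condition": "lung cancer",
     "source": "death certificate", "date": "1990-06-21"},
]
for case in consolidate(records):
    print(case)
```

Here three source records yield two cases, because the pathology report and the hospital discharge abstract refer to the same person and condition.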

The most successful registries are those in which the purposes are explicit and
realistic, the data collected are accurate and limited to essential information, and the
registry meets needs that cannot be accommodated using simpler, less expensive methods
(142,143). Even when data collection appears to be straightforward, the time and
resources  required  to  develop  a  functional  registry  are  often  underestimated.   Because 
high-quality  registries  are  resource  intensive  for  long  periods,  they  are  generally 
not  available  for  all  geographic  areas  or  exposed  groups.  Also,  the  complexity  of  the 
data-collection  process  limits  the  extent  to  which  data  can  be  made  available  rapidly. 


Registries  have  been  used  to  monitor  a  wide  range  of  health  events  and  have  identified 
opportunities  for  public  health  prevention  and  control  activities.  For  instance, 
analysis of data from one of the earliest registries--of blind persons in Great
Britain--found that blindness among a substantial proportion of the elderly was due to
treatable cataracts, a finding that had not been previously recognized (142). Other
health events that have been monitored using registries include rheumatic fever,
mental illness, Alzheimer's disease and dementia, renal disease, diabetes, heart
disease, head and spinal cord injuries, child abuse, early childhood impairments, and
occupation-related diseases such as berylliosis (16,144-149).

Registries  are  also  used  to  monitor  health  events  in  groups  with  increased  exposure  to 
hazardous  agents,  including  radiation  and  hazardous  chemicals  found  in  the  work  place 
and  the  environment  (150-154) .   Cancer,  however,  is  by  far  the  most  common  condition 
for  which  registry  information  is  used  for  surveillance. 

Case  Series  and  Hospital-Based  Registries 

Case  series  and  hospital-based  registries  have  been  useful  for  surveillance-related 
activities  even  though  population-based  rates  usually  cannot  be  estimated.  Changes  in 
the  descriptive  epidemiology  of  berylliosis  have  been  monitored  using  a  registry,  for 
instance (148,155). Cases of berylliosis increased sharply in the United States in
1939  to  1941  following  an  increase  in  the  use  of  beryllium  in  large-scale  manufacture 
of  fluorescent  lamps  and  in  war  industries.  The  number  of  cases,  among  both  workers 
and  those  who  lived  near  production  facilities,  declined  rapidly  following  changes  in 
the  manufacturing  process  and  adoption  of  an  exposure  standard.  Case  registries  have 
also  been  used  to  study  relatively  rare  conditions  such  as  mesothelioma  among  those 
exposed  to  asbestos  and  adenocarcinoma  of  the  vagina  among  women  exposed  prenatally  to 
diethylstilbestrol (156).

For  most  case  registries,  however,  the  primary  goal  is  to  provide  information  that  can 
be used to improve patient care. Registries of cancer patients are maintained by many
hospitals, and, more recently, some hospitals have established registries of persons
who have been treated for traumatic events. In the United States, hospital-based
cancer registries have been promoted by the American College of Surgeons since 1931
and have been required as part of its cancer program since 1953 (156). Standardized
software  was  made  available  to  hospitals  beginning  in  the  1980s,  and  development  of  an 
electronic  data-transfer  standard  allowed  information  to  be  transmitted  centrally  from 
nearly  2,000  hospitals,  beginning  in  1990  (157).  The  newly  formed  National  Cancer  Data 
Base  of  the  American  College  of  Surgeons  includes  basic  information  on  about  20%  of 
all  cases  of  cancer  diagnosed  each  year  in  the  United  States.  By  highlighting  the 
importance  of  histologic  confirmation  prior  to  treatment,  hospital-based  cancer 
registries have been particularly useful in improving the overall quality of treatment
for  cancer. 

More recently, development of regional and state systems for trauma care has prompted
the development of hospital-based trauma registries. The first computerized trauma
registry in the United States was developed in 1969 at Cook County Hospital in Chicago
and was expanded to a statewide registry in 1971 that included information from 50
hospitals designated as trauma centers in the state (141,143,157). National surveys in
1987 identified 105 hospitals in 35 states with hospital-based trauma registries and
10 states with central trauma registries (158). The registries differed considerably,
however,  in  the  criteria  used  for  inclusion  of  cases,  the  type  of  data  collected, 
coding  conventions,  and  the  manner  in  which  data  were  used.  In  an  effort  to  make 
information  in  hospital-based  trauma  registries  more  comparable,  standardized  case 
criteria  and  a  core  set  of  recommended  data  items,  along  with  supporting  computer 
software, were developed by CDC and others in 1988 (159). Although data from most
existing trauma registries are not population-based, they have been used to support
primary prevention activities. For instance, findings from the Virginia Statewide
Trauma Registry and other sources were used to support legislation regulating the use
of all-terrain vehicles (158).

Population-Based  Registries 

Population-based  registries  are  particularly  useful  for  surveillance  because,  using 
incidence  rates,  the  occurrence  of  a  health  event  can  be  estimated  over  time  in 
different  geographic  areas  and  subgroups  of  the  population.  For  most  registries,  the 
population  from  which  cases  are  identified  is  the  general  population  of  a  specified 
area.  Most  cancer  and  birth  defects  registries,  for  instance,  estimate  rates  for  the 
general  population.  The  population  from  which  cases  are  identified  can  also  arise  from 
a  group  defined  by  a  specific  exposure  that  is  thought  to  increase  the  risk  of 
illness.
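
An incidence rate of this kind is the number of newly registered cases divided by the population at risk, usually scaled to 100,000 persons per year. A minimal sketch follows, in which the area names, case counts, and population figures are all hypothetical:

```python
def incidence_rate(new_cases, population, per=100_000):
    """New cases per `per` persons in the population at risk for one year."""
    if population <= 0:
        raise ValueError("population must be positive")
    return per * new_cases / population


# Illustrative figures only: (area, year) -> (new cases, mid-year population).
registry_counts = {
    ("Area A", 1988): (150, 500_000),
    ("Area A", 1989): (180, 500_000),
    ("Area B", 1988): (60, 300_000),
    ("Area B", 1989): (63, 300_000),
}

rates = {
    key: incidence_rate(cases, pop)
    for key, (cases, pop) in registry_counts.items()
}
for (area, year), rate in sorted(rates.items()):
    print(f"{area} {year}: {rate:.1f} per 100,000")
```

Because both areas are expressed on the same scale, the rates can be compared across areas and years even though the underlying populations differ in size.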

Descriptive  analysis  of  incidence  rates  based  on  registry  information  can  be  used  for 
health  planning  purposes  and  can  suggest  etiologic  hypotheses  that  can  be  evaluated 
further with additional studies (50,159-162). For some conditions, comparisons between
incidence and mortality rates can be used to estimate the effectiveness of primary
prevention, early detection, or treatment programs. Findings from studies based on
registry information can also encourage physicians to abandon less-than-effective
individual therapies, thus improving the standard of medical care.

Exposure  Registries 

Examples of exposure-based registries include those of the survivors of the atomic
bombing of Hiroshima and Nagasaki during World War II and their offspring and of other
groups of persons exposed to radiation (152,163-167). Because workers are often exposed to
higher levels of physical, chemical, and biologic agents for longer periods than is
the general public, follow-up studies of cohorts of workers have been used for many years to
identify  illnesses  associated  with  these  agents  and  to  assess  how  these  illnesses  can 
be  prevented. 

Registries have also been used to assess the risk of illness for general
population  groups  exposed  to  specific  agents.  For  instance,  about  4,600  individuals 
exposed  to  polybrominated  biphenyls  through  contamination  of  dairy  cattle-food 
supplements  in  Michigan  were  followed  to  assess  acute,  subacute,  and  chronic 
conditions  that  might  have  been  associated  with  this  exposure  (168).   More  recently, 
the  United  States  Congress  has  mandated  that  the  Agency  for  Toxic  Substances  and 
Disease  Registry  (ATSDR)  address  potential  public  health  problems  associated  with 
environmental  exposures  to  hazardous  waste  sites  and  chemical  spills,  partly  through 
the creation of registries (150). ATSDR has described the rationale for a national
exposure  registry  and  methods  to  be  used  in  its  establishment  and  maintenance. 

Cancer  Registries 

Cancer  registries  are  used  in  many  different  countries  to  estimate  cancer  incidence 
and  mortality  rates  over  time.  The  Connecticut  Tumor  Registry,  the  oldest  population- 
based  cancer  registry  in  the  United  States,  has  monitored  cancer  incidence  rates  for 
nearly 50 years (156). Like hospital-based registries, the Connecticut registry was
developed  initially  to  support  the  goals  of  service-oriented  hospital-based  cancer 
registries  throughout  the  state.  Through  the  Surveillance,  Epidemiology,  and  End 
Results  (SEER)  program,  the  NCI  has  collected  information  from  specific  population- 
based  cancer  registries  since  1973.  Participant  registries  were  selected  to  include  a 
variety of population groups rather than a representative sample of the United States,
although nationwide rates can be estimated using SEER data. The four major goals of
the SEER program are:

•  to estimate cancer-related incidence and mortality in the United States;

•  to identify unusual changes in the incidence of specific types of cancer
over time in designated areas or demographic subgroups;

•  to describe changes in the extent of disease at diagnosis and to estimate
patient survival; and

•  to foster studies of cancer risk factors, screening, and prognostic
factors to allow intervention.

The  SEER  registry  is  probably  the  largest  population-based  registry  in  the  Western 
world  (156).   Between  1973  and  1988,  the  program  registered  about  1.5  million  incident 
cases  of  cancer.  At  present,  about  10%  of  the  United  States  population  lives  in  one  of 
the nine areas that include a SEER registry, and approximately 120,000 new cases of
cancer are registered from these areas each year (169). For all types of cancer
(except certain types of skin cancer), information on selected patient demographics is
recorded in addition to information on primary site, morphology, confirmation of
diagnosis, extent of disease, and first course of treatment. The registries also
actively follow all living patients (except those with in situ cervical cancer) to
ascertain vital status. Incidence rates for cancer based on SEER registry information
are  published  regularly,  and  descriptive  analyses  of  cancer  incidence  rates  by  age, 
race,  gender,  and  geographic  area  are  routinely  performed.  Although  not  part  of  the 
SEER system, many states--including New York, California, and New Jersey--maintain
active,  high-quality  cancer  registries  that  are  used  for  both  public  health  and 
hospital-directed  activities.  In  1989,  there  were  42  cancer  registries  in  the  United 
States,  including  28  state-based  registries  that  cover  part  or  all  of  a  state's 
residents (170).

In  Europe,  the  first  cancer  registry  was  founded  in  Denmark  in  1942,  and  there  has 
been  steady  growth  in  the  number  of  registries  and  the  size  of  included  populations 
since  then  (171).   At  present,  Denmark,  Belgium,  England  and  Wales,  and  Scotland  have 
nationwide  registries,  and  most  European  countries  have  registries  in  certain  regions. 
Information  from  cancer-incidence  registries  around  the  world  is  collected  by  the 
International Agency for Research on Cancer (IARC), which is part of WHO. As of 1989,
IARC had identified 238 population-based registries in 53 countries that collected
information on cancer incidence, and rates were available for selected years from 106
of these registries (170).

Registries  provide  important  information  for  a  wide  range  of  public  health  activities, 
but  their  usefulness  for  identifying  new  hazards  has,  in  practice,  been  limited. 
Initial  observations  by  astute  clinicians  rather  than  routine  analysis  of  surveillance 
data  have  led  to  more  extensive  studies  to  investigate  associations  between 
angiosarcoma  and  vinyl  chloride,  mesothelioma  and  asbestos,  and  diethylstilbestrol  and 
adenocarcinoma  of  the  vagina  (171).   Cancer  registries  were  essential,  however,  for 
identifying  cases  that  were  evaluated  in  more  extensive  epidemiologic  investigations. 
Today,  cancer  incidence  rates  from  population-based  registries  are  used  extensively  in 
cancer-cluster  investigations  to  assess  whether  the  number  of  observed  cases  differs 
substantially  from  an  expected  number  derived  from  baseline  cancer  incidence  rates. 
With  increased  emphasis  on  screening  activities  to  detect  asymptomatic  cancer  cases  at 
an  early,  more  treatable  stage  and  on  behavioral-risk-factor  control  and  possibly 
chemo-prevention,  the  public  health  importance  of  high-quality,  population-based 
cancer  registries  should  increase. 
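
The observed-versus-expected comparison used in cancer-cluster investigations can be sketched as follows. This is a minimal illustration only: the age strata, baseline rates, person-years, and case count are hypothetical, and the Poisson upper-tail probability is one common way of judging whether an excess is larger than chance alone would explain, not a method prescribed by any particular registry.

```python
import math

def expected_cases(rates_per_100k, person_years):
    """Expected count: sum over strata of baseline rate x person-years at risk."""
    return sum(
        rates_per_100k[stratum] / 100_000 * person_years[stratum]
        for stratum in person_years
    )

def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected)."""
    below = sum(
        math.exp(-expected) * expected ** k / math.factorial(k)
        for k in range(observed)
    )
    return 1.0 - below

# Hypothetical baseline registry rates (cases per 100,000 person-years) by age group
baseline_rates = {"0-39": 2.0, "40-64": 10.0, "65+": 40.0}
# Hypothetical person-years at risk in the community under investigation
person_years = {"0-39": 50_000, "40-64": 30_000, "65+": 10_000}
observed = 15  # hypothetical number of cases reported in the suspected cluster

expected = expected_cases(baseline_rates, person_years)
ratio = observed / expected  # observed-to-expected ratio
p_value = poisson_upper_tail(observed, expected)
print(f"expected={expected:.1f}  O/E={ratio:.2f}  P(X>={observed})={p_value:.4f}")
```

Stratifying the expected count by age matters because baseline cancer rates differ sharply across age groups; a crude rate applied to an unusually old or young community would misstate the expected number.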

Birth-Defects  Registries 

Recognition  of  an  epidemic  of  limb  reduction  defects  among  children  exposed  prenatally 
to  thalidomide  stimulated  interest  in  developing  population-based  birth-defects 
registries in many countries. Some birth-defects surveillance systems (e.g., the
Birth Defects Monitoring Program (BDMP) in the United States) use available sources
of information, including vital statistics and hospital-discharge data, to monitor
trends in the birth prevalence of various birth defects (172). This type of passive
monitoring  system  is  discussed  further  in  the  section  on  administrative  data  in  this 
chapter. 

Like most cancer-incidence registries, however, birth-defects registries characterized
by active case finding obtain information on individual cases from multiple sources.
In the United States, the Metropolitan Atlanta Congenital Defects Program (MACDP) has
been operated by CDC's National Center for Environmental Health (NCEH) (172-174).
All births are monitored in the five-county metropolitan Atlanta area--about 35,000
births per year. Included in the MACDP are all live-born and stillborn infants
diagnosed  as  having  at  least  one  major  birth  defect  within  their  first  year  of  life, 
with  diagnoses  ascertained  within  their  first  5  years  of  life.  Birth-defect  rates  and 
trends  are  monitored  by  quarterly  reviews  and  analysis  of  data  and  are  published 
regularly  by  CDC.   Numerous  investigations  have  been  performed  using  MACDP  data, 
including  studies  of  Vietnam  veterans'  risk  for  fathering  children  with  birth  defects, 
the risk of bearing children with specific birth defects for women with
insulin-dependent diabetes, and an apparent protective effect of peri-conceptual
vitamin use on the risk of neural tube defects (175-177). In addition, the MACDP has
served as a prototype for other birth-defects registries characterized by active
case-finding (172).

Use of equivalent case definitions, more specific coding schemes, and a uniform set of
variables has facilitated collaborative efforts among the eight birth-defects
registries in the United States characterized by active case-finding (172). For
instance, surveillance for specific birth defects associated with first-trimester
exposure to isotretinoin relies on collaborative efforts by CDC and state
birth-defects registries.

In  Europe,  population-based  birth-defects  registries  are  coordinated  through  EUROCAT, 
which is funded through the Economic Community (178). In 1983, birth defects among
250,000 births were monitored by 17 birth-defects registries in 10 countries. Both
active and passive birth-defects registries participate in the International
Clearinghouse for Birth Defects Monitoring Systems (ICBDMS), founded in 1974 by WHO as
a means of disseminating birth-defects data from surveillance systems around the
world. Information is available each year on birth defects among more than 4.5 million
births in 30 countries. Although methods used by various registries differ
considerably, the ICBDMS provides a forum for rapid dissemination of information on
teratogens. Reports from France linking valproic acid, an anti-epileptic drug, with an
increase in spina bifida were disseminated rapidly through this international network
(179,180).

More  recently,  some  registries  are  being  developed  in  some  local  communities  to 
monitor  preschool  children  for  whom  early  intervention  programs  are  needed.  These 
programs can identify children with conditions such as fetal alcohol syndrome,
cerebral palsy, mental retardation, and behavioral or learning disabilities that are
often  detected  shortly  after  birth.  These  registries  will  be  useful  for  estimating  the 
prevalence  of  these  conditions,  as  well  as  for  monitoring  the  effectiveness  of 
services  provided  to  children  with  special  needs. 

SURVEYS 
Overview 

Health  surveys,  particularly  those  that  are  conducted  on  a  continual  or  a  periodic 
basis,  can  provide  useful  information  for  assessing  the  prevalence  of  health 
conditions  and  potential  risk  factors  and  for  monitoring  changes  in  prevalence  over 
time.  More  recently,  health  surveys  have  also  been  used  to  assess  knowledge, 
attitudes,  and  health  practices  in  relation  to  certain  conditions  such  as  HIV/AIDS.  A 
survey  differs  from  a  registry  in  that  persons  surveyed  are  usually  only  queried  once 
and  are  not  monitored  individually  after  that  one  contact.  Information  on  respondents 
can  be  obtained  through  questionnaires,  in-person  or  telephone  interviews,  or  through 
record  reviews.  Attempts  are  made  to  assure  that  the  survey  sample  is  as 
representative  of  the  source  population  as  possible  in  order  to  increase  the  validity 
and reliability of estimates extrapolated to that population. Surveys can be
valuable  for  public  health  surveillance  if  similar  information  is  collected  over  time 
and  if  findings  are  applied  to  public  health  activities. 

In  the  United  States,  surveys  such  as  NCHS's  National  Health  Interview  Survey  (NHIS) 
are  important  sources  of  information  for  monitoring  nationwide  trends  in  the 
prevalence  of  target  conditions  and  risk  factors  for  which  national  health  objectives 
for  the  year  2000  have  been  established  (73,181).   Nationwide  surveys  are  costly, 
however,  and  due  to  their  complex  sample  designs,  specialized  statistical  techniques 
are  often  needed  for  analysis.  Since  information  is  usually  not  available  at  a  local 
level,  the  usefulness  of  national  surveys  for  local  surveillance  activities  is 
limited. 
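
One reason complex sample designs call for specialized analysis is that respondents do not represent equal numbers of people in the population. The sketch below shows a weighted prevalence estimate, assuming each respondent carries a sampling weight (the number of persons he or she represents). The respondents and weights are hypothetical, and the sketch deliberately omits variance estimation, which for clustered and stratified designs requires design-based methods not shown here.

```python
def weighted_prevalence(respondents):
    """Estimate prevalence as sum(weight * indicator) / sum(weight)."""
    total_weight = sum(weight for _, weight in respondents)
    weighted_cases = sum(
        weight for has_condition, weight in respondents if has_condition
    )
    return weighted_cases / total_weight

# Hypothetical respondents: (has_condition, sampling_weight)
sample = [(True, 1000), (False, 3000), (True, 500), (False, 500)]

unweighted = sum(1 for has_condition, _ in sample if has_condition) / len(sample)
weighted = weighted_prevalence(sample)
print(f"unweighted={unweighted:.2f}  weighted={weighted:.2f}")
```

In this hypothetical sample the unweighted proportion overstates prevalence because persons with the condition happen to carry smaller weights; ignoring the weights would bias the population estimate.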

Health  Interview  Surveys 

In  the  United  States,  the  NHIS,  conducted  annually  since  1957,  provides  information  on 
self-reported  illnesses,  chronic  conditions,  injuries,  impairments,  the  use  of  health 
services, and other health-related topics for the civilian, non-institutionalized
population (182,183). Households are identified through a complex sample design
involving  both  clustering  and  stratification.  Households  selected  for  interview  each 
week  are  a  probability  sample  from  a  primary  sampling  unit  such  as  a  county  or 
metropolitan  area.  Respondents  are  interviewed  in  their  homes  with  an  adult  family 
member  providing  information  for  other  members  of  the  household.  Each  year, 
information  is  collected  on  about  122,000  people  from  about  48,500  households  (2).  The 
interviews, which average about 80 minutes, include a core set of health and
socio-demographic questions that are repeated each year and a supplemental section in
which detailed information is collected on specific health topics. In 1987, for
instance, supplemental information was collected on risk factors for cancer and on
knowledge and attitudes regarding AIDS. NHIS questions will be modified in the future
so that
progress  toward  meeting  the  year  2000  health  objectives  for  the  nation  can  be 
monitored  closely. 

In England, Scotland, and Wales, the General Household Survey (GHS), in which
information on housing, employment, education, health, and use of social services is
obtained using structured personal interviews, has been in operation since 1971 (2). An
analogous Continuous Household Survey is conducted in Northern Ireland. Electoral
wards form the primary sampling units, and about 85% of households--a total of about
12,000 per year--agree to participate in the GHS. Over time, the health section of the
survey has included questions on limitations in activities because of acute or chronic
illnesses, smoking and drinking patterns, contacts with health-care providers, and
other health-related topics. The ability to compare health-related information with
extensive socio-demographic information is one of the major strengths of these
surveys.

In  the  United  States,  CDC's  National  Center  for  Chronic  Disease  Prevention  and  Health 
Promotion  (NCCDPHP)  has  worked  with  state  health  departments  since  1981  to  conduct 
telephone  surveys  about  adult  health  behavior  and  use  of  prevention  services.  The 
primary  purpose  of  these  surveys  is  to  support  state  prevention  initiatives. 
Questionnaires  used  by  the  Behavioral  Risk  Factor  Surveillance  System  (BRFSS)  include 
a  core  set  of  questions,  and,  depending  on  a  state's  interest,  supplemental  questions 
developed by CDC and questions that meet state-specific needs (184). The 1988 BRFSS
included questions on height, weight, physical activity, smoking, alcohol use,
seat-belt use, and use of prevention services, such as cholesterol screening and
mammography. By 1990, 45 states and the District of Columbia were conducting these
surveys. Some states have used BRFSS procedures to conduct more detailed studies. In
Missouri, for instance, cholesterol awareness was compared in urban and rural areas,
and in California, cigarette smoking was compared among Chinese,
Vietnamese, and Hispanics in three communities (185,186). Information from the BRFSS
is timely and can reflect the particular interests of a state or local community. Use
of telephones for interviewing is economical, although persons without telephones,
who are not included in these surveys, are generally more likely to be in need of
public health services than are many of the respondents.

Since  1988,  NCCDPHP  has  developed  and  implemented  a  Youth  Risk  Behavior  Survey  (YRBS) 
to  focus  the  efforts  of  local,  state,  and  federal  agencies  that  monitor  the  behavior 
of young people (187). In 1990, the national survey used a three-stage sample design
to  obtain  a  probability  sample  of  11,631  students  in  grades  9  through  12  in  50  states, 
the  District  of  Columbia,  Puerto  Rico,  and  the  Virgin  Islands.  From  the  1990  survey, 
estimates  are  available  for  the  prevalence  of  tobacco  use,  alcohol  and  drug  use, 
exercise,  diet,  types  of  behavior  that  affect  the  risk  of  intentional  and 
unintentional injuries, and sexual activity (188-194). The YRBS was designed to
monitor  changes  in  these  types  of  behaviors  biennially  so  that  progress  toward  meeting 
year  2000  objectives  can  be  monitored. 

Provider-Based  Surveys 

In  the  United  States,  information  on  the  use  of  health-care  services  is  not  available 
routinely.  In  order  to  estimate  the  use  of  these  services  nationally,  NCHS  has 
developed  two  complementary  surveys,  the  National  Hospital  Discharge  Survey  (NHDS)  and 
the National Ambulatory Medical Care Survey (NAMCS), in which characteristics of
health encounters are monitored (181,195,196). Through the NHDS, information has been
collected  since  1965  on  discharges  from  non-federal,  short-stay  hospitals,  including 
characteristics  of  patients,  length  of  stay,  diagnoses,  surgical  procedures,  and 
hospital  size  and  type  of  ownership.  Beginning  in  1987,  computerized  information  for 
some  discharges  was  purchased  from  commercial  abstracting  services,  but,  otherwise, 
discharges  are  sampled  randomly  from  hospitals  included  in  the  survey.  In  1987, 
information was collected on about 181,000 discharges from about 400 hospitals--about
81% of the hospitals that were asked to participate. Although hospital-discharge
information is available in many states, it is not available nationally, so that state
estimates are often derived by extrapolation from the NHDS. Data from the NHDS as well
as other sources have been used, for instance, to assess the public health burden of
nine major chronic diseases (197).

The NAMCS was conducted annually from 1973 to 1981 and in 1985 and has been conducted
annually since 1989. The target population for the NAMCS is office visits within the
continental United States to non-federal physicians who are in office-based practice
and engaged in direct patient care (9,181,196). About 70% of all ambulatory visits occur in
physicians'  offices,  and  about  70%  of  selected  physicians  agreed  to  participate  in  the 
survey  in  1990.  Beginning  in  1989,  about  2,500  physicians  were  included  in  the  sample, 
with  each  physician  completing  a  short  form  for  about  30  office  visits.  Information  on 
visits  to  hospital  out-patient  departments  and  emergency  rooms  may  be  added  to  the 
NAMCS  in  the  future.  In  addition  to  information  on  diagnoses,  medications,  and  reason 
for  visit,  the  1990  NAMCS  included  information  on  diagnostic  and  screening  services; 
counseling  for  drug,  alcohol,  and  smoking  cessation;  and  other  counseling  services 
(198). Estimates are published at the national level and, for some events, at the
regional level. Unlike hospital-discharge data, ambulatory-care data are rarely
available for routine use at the state or local level in the United States. To obtain
information that could be used in its programs, however, Wisconsin conducted an
ambulatory medical care survey in 1986-1987 based on the NAMCS questionnaire and study
design (199). Proprietary data bases, such as the National Disease and Therapeutic
Index (NDTI), provide ongoing data on conditions seen in ambulatory care settings.
Although used primarily by the pharmaceutical industry, the NDTI has been used to
monitor the public health impact of recommendations to limit the use of aspirin in
children with fevers (200).

Other  Surveys 

Other NCHS surveys, including the National Survey of Family Growth (NSFG) and the
National Health and Nutrition Examination Survey (NHANES), also contain information
that is useful for public health activities. The NSFG has provided national data on
demographic and social factors associated with childbearing, adoption, and maternal
and child health based on household interviews of women of childbearing age. The
survey has been conducted four times--in 1973, 1976, 1982, and 1988 (201-203).
The NHANES has provided extensive information on the prevalence of chronic conditions,
distribution of physiologic and anthropometric measures, and nutritional status for
representative  samples  of  the  U.S.  population  (204,205).  The  first  two  NHANES  cycles 
were conducted in 1971 through 1974 and 1976 through 1980, and data collection is
currently under way for the third cycle. A Hispanic Health and Nutrition Examination
Survey was conducted in 1982 through 1984 in order to compare health and nutritional
measures among U.S. residents of Mexican, Puerto Rican, and Cuban origin (206). Also,
almost 4000 persons aged 55 to 74 years who had been interviewed in NHANES I
and were living in 1984 were enrolled in the NHANES I Follow-up Study to assess
whether their characteristics in the 1970s predicted subsequent health outcomes (207).
The  NHANES  studies  are  rich  sources  of  information  that  are  used  primarily  for 
epidemiologic  and  related  analyses.  They  have  been  used,  however,  to  provide  point 
estimates to monitor changes over time in health outcomes, such as changes in
blood-lead levels (208). In general, sources of information that are available for
more of the population over longer periods are more useful for routine
surveillance activities.

ADMINISTRATIVE DATA-COLLECTION SYSTEMS
Overview 

Through  the  use  of  standard  procedures  and  classification  schemes,  vital  statistics 
are  derived  from  birth  and  death  certificates,  completed  primarily  for  legal  reasons. 
Likewise,  information  on  conditions  not  evident  at  the  time  of  birth  or  death  can  be 
derived  from  administrative  information  routinely  available  on  episodes  of  care 
(including  hospitalizations,  visits  to  emergency  rooms,  and  visits  to  health-care 
providers  in  the  community).   In  most  instances,  routinely  collected  administrative 
data  have  been  computerized  for  billing  purposes,  but  since  diagnoses  are  often 
included,  these  data  sets  can  provide  useful  information  for  public  health 
surveillance.   As  computerized  administrative  data  become  increasingly  available, 
their  importance  for  monitoring  a  wide  range  of  health  outcomes  is  increasing. 

Availability  and  usefulness  of  administrative  data  for  surveillance  depend  on  a  number 
of  factors  including: 

•  the type of information that is computerized;

•  the extent to which uniform classification schemes are used to categorize
diagnoses, signs, symptoms, procedures, and reasons for seeking health
care;

•  the availability of sufficient computer capacity and user-friendly
software programs to process large amounts of data;

•  the extent to which supplementary information can be obtained; and

•  the extent to which information for individuals from different
administrative sources or time periods can be linked using a unique
personal identifier.

Data  that  include  personal  identifiers  are  particularly  useful  both  because  statistics 
can  be  calculated  on  the  basis  of  persons  rather  than  on  episodes  of  care  and  because 
additional  information  can  often  be  obtained  through  linkage  with  other  data  sets. 
Special  precautions  are  needed,  however,  to  protect  the  confidentiality  of  individuals 
when  personal  identifiers  are  included  in  computerized  administrative  data  bases. 
Even when personal identifiers are not included, however, administrative data can be
very useful for assessing the public health burden of various conditions based on
the number of health-care visits and their costs.
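
The difference between statistics based on persons and statistics based on episodes of care can be illustrated with a short sketch. The discharge records and identifiers below are hypothetical; the point is only that, without a personal identifier, repeat admissions for the same person cannot be collapsed.

```python
from collections import Counter

# Hypothetical discharge records: (personal_identifier, diagnosis)
discharges = [
    ("A", "asthma"), ("A", "asthma"), ("B", "asthma"),
    ("C", "hip fracture"), ("A", "asthma"),
]

# Episode-based count: every discharge is counted
episodes_by_dx = Counter(dx for _, dx in discharges)

# Person-based count: each person is counted once per diagnosis,
# which is possible only because the identifier is present
persons_by_dx = Counter(dx for _, dx in set(discharges))

print("episodes:", dict(episodes_by_dx))
print("persons: ", dict(persons_by_dx))
```

Here four asthma discharges come from only two persons, so an episode-based count would double the apparent number of affected individuals.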

Integrated  health-information  systems  based  on  administrative  data  are  available  in  a 
few  countries,  but  in  most,  information  may  be  available  only  for  certain  types  of 
health  care  (e.g.,  hospitalizations)  or  for  certain  segments  of  the  population  (e.g., 
those who receive care through the public sector). Although administrative data are
usually incomplete, their analysis has proved useful for public health surveillance and
program planning.

Integrated  Health  Information  Systems 

Integrated health-information systems, in which data on individuals are consolidated
from a variety of sources, are available in Sweden, Canada, and for limited groups in
the United States. In Sweden, for instance, use of a unique personal identifier
assigned at birth allows the linkage of computerized information on individuals from a
variety of sources, including birth and death certificates, the cancer registry,
hospital discharge summaries, and prescription records (209). In addition to
etiologic studies, linked Swedish data bases have been used for a variety of
surveillance-related analyses. Examples include estimating the incidence of acute
myocardial  infarction;  comparing  methods  of  ascertaining  myocardial  infarction  using 
community  registers,  hospital  discharge  data,  and  mortality  data;  and  assessing 
temporal  trends  in  the  incidence  of  hip  fracture  (144,146,210). 

In  Canada,  the  Saskatchewan  Health  Plan  maintains  population-based  billing  information 
including  diagnoses  from  inpatient,  outpatient,  and  prescription  records  for 
approximately 1 million residents beginning in 1979 (211,212). This information,
which has been used in studies of associations between nonsteroidal anti-inflammatory
drugs and fatal gastrointestinal bleeding and of associations between valproic acid
use and congenital malformations, could also be used for ongoing surveillance
activities (213,214).

In the United States, integrated health-information systems have been developed for
some health-maintenance organizations, such as the Kaiser Permanente system, or for
geographic areas served by one major health care provider--such as Rochester,
Minnesota. Although used frequently for research, the few integrated
health-information systems in the United States are of limited use for general public
health surveillance because the populations included in them are relatively small and
not representative of the U.S. population. These systems are useful, however, for
providing information on incidence and prevalence for conditions difficult to monitor
nationally--such as the trends in incidence for specific types of primary intracranial
neoplasms (215) and the prevalence of osteoarthritis of the knee with and without
corroborative radiographic findings (216).

Hospital-Discharge  Data  Systems 
Overview 

The  importance  of  collecting  information  on  morbidity  from  hospital  records  was  noted 
by  Florence  Nightingale  among  others,  although  attempts  to  collect  and  analyze  this 
information systematically were not initiated until the 1940s in Scotland (2,217).
Today, information from hospital discharge summaries--including
demographic information and discharge diagnoses--is routinely collected and
computerized using standard data-set formats such as the 1981 Recommended Minimum
Basic Data Set (RMBDS) for the European Community and the Uniform Hospital Discharge
Data Set (UHDDS) or the Medicare Uniform Bill-82 (UB-82) formats in the United States
(218,219). Both the UHDDS and the UB-82 formats are currently being revised in
tandem.

In  Scotland,  for  example,  a  standard  morbidity  record  form  is  completed  for  each 
admission  to  a  general,  psychiatric,  or  maternity  hospital  and  is  sent  to  a  central 
agency for processing and statistical analysis (217). Initiated in parts of Scotland
in 1951, the system included the entire country by 1961. Although records
include a unique personal identifier, they are not linked routinely except in one area
of the country. With the advent of the National Health Service in 1948, a similar
system based on 10% of hospital admissions was initiated in England and Wales that
covered all areas by 1958.

To  monitor  the  quality  of  care  provided  in  U.S.  hospitals,  each  acute-care  hospital  is 
required  by  the  Joint  Commission  on  Accreditation  of  Healthcare  Organizations  to 
report  information  on  diagnoses,  length  of  stay,  and  inpatient  services.   Hospitals 
often  contract  with  private  companies  to  abstract  and  computerize  pertinent  data  from 
medical  records,  but  in  recent  years,  many  hospitals  are  computerizing  this 
information  themselves  or  abstracting  it  from  computerized  treatment  records. 
Beginning  in  the  early  1980s,  individual  states  began  to  require  submission  of 
hospital-discharge  data  for  utilization,  financial,  and  other  health-planning  studies 
(219). Thus, hospital discharge summary data are computerized for most discharges
from  acute-care  hospitals  in  the  United  States,  but  data  are  not  available  nationally 
for  all  segments  of  the  population  from  any  one  source. 

Private-sector systems

In  the  private  sector,  the  Commission  on  Professional  and  Hospital  Activities  (CPHA) 
has  abstracted  information  from  medical  records  of  U.S.  hospitals  for  over  30  years 
(219,220). Today, CPHA's Professional Activities Study (PAS) data base includes over
200 million records with diagnoses coded according to the clinical modification of the
ICD-9 (ICD-9-CM) in the UHDDS format; 6 million more records are being added each year
(219,221). The PAS includes information from clinical rather than billing records,
since  staff  from  cooperating  hospitals  review  medical  charts,  prepare  case  abstracts, 
and send information to CPHA. Hospital-discharge data from CPHA and, more recently,
from the McDonnell Douglas Hospital Information System (MDHIS) have been used for the
surveillance of birth defects and related conditions (222). Today, the Birth Defects
Monitoring Program (BDMP), initiated in 1974, includes information from newborn
discharge  summaries  for  about  1  million  newborns  per  year--about  25%  of  the  births  in 
the  United  States.   Prevalence  rates  are  calculated  using  the  number  of  live  births  as 
the  denominator,  and  trends  in  rates  for  targeted  conditions  are  published  routinely 
(223). Information for the BDMP is abstracted from hospital discharge summaries and
is not routinely verified. Although personal identifiers are not included in BDMP
data sets, participating hospitals have agreed to provide hospital records for special
studies using their own patient numbers to identify records (224,225). More recently,
additional information on possible maternal exposures (e.g., infections, use of
prescription or illicit drugs, or the use of alcohol) linked to birth defects or other
adverse outcomes noted at birth is available for a subset of infants in the BDMP.
Probabilistic matching procedures are used to link summary data without personal
identifiers from newborn and maternal hospital discharge records (222). Validation
studies  indicate  that  about  95%  of  the  records  linked  using  the  matching  algorithm  are 
true  matches.   Linked  maternal  and  infant  hospital-discharge  records  are  particularly 
useful  for  investigating  problems  associated  with  maternal  exposures.   Information  on 
birth  defects  surveillance  systems  characterized  by  active  case-finding  and 
integration  of  information  from  multiple  sources  appears  in  the  registry  section  of 
this  chapter. 
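
Probabilistic matching of records that lack personal identifiers can be sketched as an agreement-weight score. The sketch below follows the general log-likelihood-ratio approach often used for record linkage; it is not the BDMP's actual matching algorithm, and the field names, the m and u probabilities, the threshold, and the records are all hypothetical.

```python
import math

# Hypothetical linkage fields with assumed probabilities:
#   m = P(field agrees | records are a true mother-infant pair)
#   u = P(field agrees | records are not a true pair)
FIELDS = {
    "hospital": (0.99, 0.05),
    "delivery_date": (0.95, 0.01),
    "zip_code": (0.90, 0.02),
}

def match_weight(newborn, mother):
    """Sum of log2 likelihood ratios over fields; higher means more likely a true pair."""
    score = 0.0
    for field, (m, u) in FIELDS.items():
        if newborn[field] == mother[field]:
            score += math.log2(m / u)
        else:
            score += math.log2((1 - m) / (1 - u))
    return score

def best_match(newborn, mothers, threshold=5.0):
    """Return the highest-scoring maternal record, or None if none reaches the threshold."""
    if not mothers:
        return None
    score, mother = max(
        ((match_weight(newborn, m), m) for m in mothers), key=lambda pair: pair[0]
    )
    return mother if score >= threshold else None

newborn = {"hospital": "H1", "delivery_date": "1988-03-02", "zip_code": "30301"}
mothers = [
    {"id": "M1", "hospital": "H1", "delivery_date": "1988-03-02", "zip_code": "30301"},
    {"id": "M2", "hospital": "H1", "delivery_date": "1988-03-05", "zip_code": "30310"},
]
linked = best_match(newborn, mothers)
print(linked["id"] if linked else "no link")
```

The threshold trades missed links against false links; validation against a sample of records whose true match status is known is one way to estimate what proportion of accepted links are true matches.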

In  the  United  States,  use  of  hospital-discharge  data  from  CPHA,  MDHIS,  or  other 
private-sector  sources  is  more  limited  for  surveillance  of  conditions  other  than  those 
identified  at  birth.  For  the  latter,  birth-prevalence  rates  can  be  calculated  using 
the  number  of  live  births  in  that  hospital  as  the  population  at  risk,  even  if  the 
geographic  areas  to  which  these  rates  apply  are  not  known.  Calculation  of  incidence 
or prevalence rates for other conditions is limited by two factors: first, the
lack of complete coverage for a geographic area limits the use of census data to
estimate the population at risk; and, second, initial hospitalizations for
conditions cannot usually be distinguished from subsequent hospitalizations.
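
The birth-prevalence calculation described above, with live births in the reporting hospitals as the denominator, amounts to the following. The counts are hypothetical, and the scaling per 10,000 live births is an assumed reporting convention.

```python
def birth_prevalence_per_10k(cases, live_births):
    """Cases of a birth defect per 10,000 live births."""
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    return cases / live_births * 10_000

# Hypothetical yearly counts from reporting hospitals: year -> (cases, live births)
counts = {1986: (45, 100_000), 1987: (60, 120_000)}

rates = {
    year: birth_prevalence_per_10k(cases, births)
    for year, (cases, births) in counts.items()
}
for year, rate in sorted(rates.items()):
    print(f"{year}: {rate:.1f} per 10,000 live births")
```

Because both numerator and denominator come from the same hospitals, the rate is interpretable even when the hospitals' catchment area is unknown, which is not true for conditions that arise after birth.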

In  1988,  29  states  maintained  hospital-discharge-data  systems  for  acute-care 
hospitals:  17  in  the  UB-82  format,  eight  in  the  UHDDS  format,  and  four  in  unique  data 
formats (219). Although not currently required on the UHDDS or the UB-82, external
cause-of-injury codes ("E codes") are required in eight states (226). In most states,
unique personal identifiers are not computerized, and the extent to which these data
can  be  accessed  and  used  for  surveillance  varies  from  state  to  state.   When  hospital 
discharge  information  is  available,  however,  estimates  of  the  public  health  burden  of 
inpatient care--based on the number, the duration, and the cost of hospitalizations--
have  been  useful  for  setting  priorities  for  prevention  or  treatment  efforts  or  for 
targeting  interventions  to  specific  subgroups  in  the  community. 

In California, for instance, hospital-discharge data coupled with estimates of the
proportion  of  specific  diseases  attributable  to  smoking  were  used  to  estimate  the  cost 
of  treating  smoking-related  diseases  paid  with  public  funds.  To  recoup  some  of  these 
costs,  California  instituted  a  25-cent  sales  tax  on  tobacco  products  in  1989  (227). 
State-based  hospital  discharge  data  systems  have  also  been  used  effectively  to  assess 
the public health impact of injuries in states that require "E codes" (226). For
instance, the effect of mandatory seat-belt laws and more stringent drunk-driving laws
on motor-vehicle-related injuries has been demonstrated using hospital-discharge data
that include "E codes."
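
The California cost estimate mentioned above combines hospital-discharge costs with disease-specific attributable fractions. A minimal sketch of that arithmetic follows. The disease names, costs, and fractions are hypothetical, and the actual California analysis may have used a different formulation.

```python
from dataclasses import dataclass


@dataclass
class DiseaseCost:
    name: str
    public_cost: float            # publicly funded hospital costs (hypothetical)
    attributable_fraction: float  # share attributable to smoking, 0 to 1 (hypothetical)


def smoking_attributable_cost(rows: list[DiseaseCost]) -> float:
    """Sum of publicly funded cost times attributable fraction across diseases."""
    total = 0.0
    for row in rows:
        if not 0.0 <= row.attributable_fraction <= 1.0:
            raise ValueError(f"fraction out of range for {row.name}")
        total += row.public_cost * row.attributable_fraction
    return total


# Illustrative values only; not taken from any cited study.
example = [
    DiseaseCost("lung cancer", 1_000_000.0, 0.80),
    DiseaseCost("coronary heart disease", 2_000_000.0, 0.25),
]
print(round(smoking_attributable_cost(example)))
```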

Federal  data-collection  systems 

In the United States, health care is provided using public funds for about one-quarter
of the non-institutionalized population--including the elderly (13%), the poor (9%),
and the military and their dependents (4%) (228). In 1965, two federal health-insurance
programs--a hospital insurance plan and a supplementary insurance plan--were
established for persons ≥ 65 years of age. Both of these Medicare health-insurance
programs are administered by HCFA. All eligible recipients are enrolled in the first
plan (Part A), which provides coverage for inpatient hospitalizations, stays in skilled
nursing facilities, and home health services. The second plan (Part B), for which
beneficiaries pay a small premium, covers physician services, outpatient hospital
services, and other medical services. About 96% of the population ≥ 65 years is
enrolled in at least the Part A program (229). Medicare programs were extended in
1972 to cover persons with end-stage renal disease that required dialysis or
transplantation and persons with disabilities < 65 years of age (230). In Fiscal Year
1988, Medicare program payments for 31 million beneficiaries ≥ 65 years and an
additional 3 million persons with disabilities accounted for about 18% of all personal
health-care spending in the United States.

For Part A claims, computerized bills in the UB-82 format are submitted to fiscal
intermediaries and then are consolidated nationally. Diagnoses included on each bill
affect payment to hospitals because, since 1983, most short-stay hospitals have been
paid for each case on the basis of prospectively established rates for some 475
diagnosis-related groups (DRGs) (228). To monitor the quality of care provided
through Medicare programs, HCFA created the Medicare Provider Analysis and Review
(MEDPAR) file by linking information on individuals such as age, gender, race, and
residence from the eligibility files; information on diagnoses and treatment from Part
A and Part B claims files; and information on health-care providers from a facilities
file. A unique health-insurance number--usually the social security number--is used
to link information on individuals. HCFA has created a public-use file for Part A
data from the MEDPAR file and plans to add Part B files, which will include
diagnostic data, in 1992 (231).

Although most studies using MEDPAR files have focused on quality of care and medical
effectiveness, these files have also been used to assess the public health impact of
various conditions such as end-stage renal disease and hip fracture among the elderly
(107,230-234). Point prevalence can be estimated because nearly all members of the
general population ≥ 65 years are enrolled in Medicare. Incidence can also be
estimated for some conditions because the first hospitalization can be identified in
records for an individual linked by using the unique personal identifier. These
estimated incidence rates would approximate true incidence rates more closely,
however, for acute events such as hip fracture than for long-standing conditions such
as Type II diabetes. Since many conditions are common among the elderly, rates can
often be estimated for small geographic areas such as cities or counties (235).
Recent studies indicate, for instance, that hip fracture is more common in southern
states, even though weather conditions are more adverse in the north (236,237).
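
The idea of estimating incidence from first hospitalizations in linked records can be sketched as below. The record layout, identifiers, and enrollment figure are hypothetical. A real analysis would also need a look-back period to rule out earlier admissions, which this sketch omits.

```python
from datetime import date


def first_admissions(records):
    """Return the earliest admission date for each beneficiary identifier.

    records: iterable of (beneficiary_id, admission_date) pairs for one condition.
    """
    first = {}
    for bene_id, admitted in records:
        if bene_id not in first or admitted < first[bene_id]:
            first[bene_id] = admitted
    return first


def incidence_per_1000(records, year, enrolled):
    """First hospitalizations in `year` per 1,000 enrolled persons."""
    if enrolled <= 0:
        raise ValueError("enrolled must be positive")
    new_cases = sum(1 for d in first_admissions(records).values() if d.year == year)
    return 1000 * new_cases / enrolled


# Hypothetical hip-fracture admissions; "A" was first admitted in 1987,
# so only "B" and "C" count as new cases in 1988.
admissions = [
    ("A", date(1987, 3, 1)),
    ("A", date(1988, 5, 2)),
    ("B", date(1988, 1, 10)),
    ("C", date(1988, 7, 4)),
    ("C", date(1988, 9, 9)),
]
print(incidence_per_1000(admissions, year=1988, enrolled=500))
```

Counting only the first admission per person is what separates incidence from a simple count of hospitalizations. It works best for acute events, as the text notes.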

Even more useful public health surveillance information about Medicare recipients
should be available in the near future. A National Claims History File is being
created for elderly Medicare recipients with information from all claims linked for
individuals (219). To obtain additional clinical information, medical records for a
random sample of beneficiaries will be abstracted using standard procedures to create
a Uniform Clinical Data Set. Self-administered questionnaires will be sent to a
sample of the elderly at regular intervals to obtain additional information on health
status prior to entering the Medicare program, on health-related behaviors, and on
functional status. Information from all these sources will be linked in the Medicare
Beneficiary Health Status Registry. Information from other sources, such as the SEER
registry and other cancer-incidence registries, will be linked with Medicare files when
possible (238). An end-stage renal disease registry has been developed by linking
health-claims information (239). As they become available, these enhanced data sets
should  prove  useful  for  monitoring  trends,  for  public  health  planning,  and  for 
evaluating  the  effectiveness  of  medical  and  preventive  health  services  such  as 
mammography  and  vaccination. 

Medicaid,  HCFA's  other  major  public  health-care  program,  provides  health-care  funds 
for  the  poor  and  medically  needy  through  a  federal-state  cost-sharing  program. 
Medicaid data have been used for surveillance and program planning at state and
local  levels,  particularly  in  the  maternal  and  child  health  area.   Further  information 
on  uses  of  Medicaid  claims  data  for  surveillance  is  provided  in  the  ambulatory  care 
and  related  data  section  of  this  chapter. 

Hospital-discharge  records  from  IHS  hospitals  have  been  particularly  useful  for 
developing  community-specific  injury  profiles  and  targeting  local  public  health 
interventions (226). "E codes" have been included in discharge summaries from IHS
hospitals for over 20 years, and regional injury prevention coordinators are notified
electronically of injury-related hospitalizations. Identification of hazardous areas
through analysis of local data has led to brighter and more effective
lighting and to installation of pedestrian walkways along hazardous stretches of road.

Data-Collection  Systems  in  Emergency  Rooms  and  Other  Units 

Administrative  data  from  hospital  emergency  rooms  have  been  used  for  surveillance  of  a 
variety of acute health events including non-fatal injuries, illicit drug use,
poisonings, and adverse reactions to prescription drugs. Unlike inpatient
hospital-discharge data, however, emergency-room data are not routinely computerized
and reported from all hospitals in a standard format. Because the type of information
recorded and the filing systems used to retrieve health information differ, special
surveillance systems focused on specific outcomes such as injuries or illicit drug use
have been developed using information obtained from cooperating hospitals.
Information from these special surveillance systems is usually not linked with other
data  sources.  Although  the  scope  of  these  systems  is  limited,  they  have  provided 
useful  information  for  the  surveillance  of  acute,  non-fatal  health  events  for  which 
admission  to  a  hospital  is  not  warranted. 

In  England  and  Wales,  information  has  been  provided  by  the  Home  Accident  Surveillance 
System (HASS) since 1976 (240). Information is collected by trained clerks from 20
randomly sampled major emergency departments. Each hospital remains in the system for
4 years, and five hospitals are replaced each year from the pool of 270 hospitals with
large emergency departments. A similar system, the European Home and Leisure
Accident Surveillance System (EHLASS), is being implemented in all European Economic
Community  countries. 
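
The HASS rotation described above (20 hospitals, each serving 4 years, five replaced each year from a pool of 270) can be sketched as a rotating panel. The function name and the rule that replacements are drawn only from hospitals not currently in the panel are assumptions made for illustration. The text does not specify how replacements are chosen beyond random sampling.

```python
import random


def rotate_panel(cohorts, pool, rng, per_year=5):
    """Retire the oldest cohort and draw a new one from outside the panel.

    cohorts: list of cohorts, oldest first; each cohort is a list of hospital IDs.
    pool: all eligible hospital IDs.
    rng: a random.Random instance, passed in so results are reproducible.
    """
    in_panel = {h for cohort in cohorts for h in cohort}
    eligible = sorted(h for h in pool if h not in in_panel)
    new_cohort = rng.sample(eligible, per_year)
    return cohorts[1:] + [new_cohort]


rng = random.Random(0)
pool = list(range(270))                      # hypothetical hospital IDs
initial = rng.sample(pool, 20)
panel = [initial[i:i + 5] for i in range(0, 20, 5)]   # four cohorts of five

for year in range(4):                        # after 4 rotations every cohort is new
    panel = rotate_panel(panel, pool, rng)

print(sum(len(c) for c in panel))            # panel size stays at 20
```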

In  the  United  States,  information  on  injuries  associated  with  the  use  of  consumer 
products  (other  than  automobiles)  is  available  through  CPSC's  National  Electronic 
Injury Surveillance System (NEISS). Since 1972, information on consumer-product-
related injuries, poisonings, and burns has been abstracted from emergency-room
records of a representative sample of hospitals (9). Information is sent
electronically each day to CPSC, and more in-depth information can be obtained on
conditions of special interest. Information on occupation-related injuries has been
collected  since  1982,  although  the  number  of  hospitals  included  in  NEISS  was  reduced 
from  the  original  73  to  62  in  1987  (241,242). 

National estimates for a variety of conditions are derived by weighting data from
reporting hospitals. NEISS has provided estimates of various consumer-product- and
occupation-related injuries, including estimates of the number of work-related
injuries in the United States, bicycle-related injuries, and poisonings among children
(241-243). NEISS provides the only national estimates of injuries seen in emergency
rooms, although the number of hospital emergency rooms on which this information is
based is relatively small. NEISS data have also been used to assess the public health
impact of injuries at the local level. NEISS data from one hospital were used to
identify a cluster of injuries that occurred among young girls and were related to
playground merry-go-rounds (244). Pediatric injury surveillance systems using emergency
room and hospital discharge data have also been established in other areas (245,246).
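
Deriving a national estimate by weighting data from sample hospitals can be sketched as follows. This is a generic inverse-probability-weighted total, offered as an assumption. The text does not describe the actual NEISS weighting scheme, and the hospital labels, counts, and weights here are invented.

```python
def weighted_national_estimate(hospital_counts, weights):
    """Weighted total: each hospital's count times its sampling weight.

    hospital_counts: dict of hospital ID -> injuries observed in the period.
    weights: dict of hospital ID -> number of hospitals that one represents
             (the inverse of its selection probability).
    """
    missing = set(hospital_counts) - set(weights)
    if missing:
        raise KeyError(f"no weight for hospitals: {sorted(missing)}")
    return sum(weights[h] * n for h, n in hospital_counts.items())


# Hypothetical sample of three hospitals.
counts = {"H1": 40, "H2": 15, "H3": 5}
weights = {"H1": 50.0, "H2": 80.0, "H3": 120.0}
print(weighted_national_estimate(counts, weights))
```

Because each sampled hospital stands in for many others, an estimate based on a small number of emergency rooms can have wide sampling variability, which is the limitation the text points out.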

In the United States, NIDA's Drug Abuse Warning Network (DAWN) relies on reports from
about 700 hospital emergency rooms and 85 medical examiners' or coroners' offices to
detect emerging trends in the nature and severity of drug-abuse problems (9,247).
Facilities have reported voluntarily to DAWN since 1972; about 453
emergency rooms in 21 U.S. cities reported data consistently by 1991 (248).
Cocaine-related deaths increased rapidly between 1985 and 1988, although recent reports
suggest that cocaine-related medical emergencies began to decrease in the first half
of  1989.  In  the  same  metropolitan  areas,  about  twice  as  many  deaths  were  identified 
through  DAWN  as  through  the  vital  statistics  system,  although  time  trends  were  similar 
in  both  types  of  data.  The  DAWN  system  provides  timely  information  on  medical 
emergencies  related  to  drug  abuse,  although  estimates  are  not  population-based  and  are 
based  on  voluntary  participation  from  medical  facilities. 

In  some  areas,  information  may  be  available  from  poison-control  centers,  burn  units, 
or trauma registries. In Great Britain, poison-control centers--particularly the
National Poison Information Service in London--have provided information for a variety
of studies of trends in abuse of solvents and poisonings of children (249). In the
United States, poison-control centers--covering 430 defined geographic areas--reported
over 121,000 instances of exposure to suspected poisons to FDA (243). For
instance, reports to FDA of childhood poisonings have declined since the introduction
of child-resistant caps for medication containers, and among children < 5 years of age,
flavored chewable vitamins are now the most common pharmaceutical product associated
with poisoning. Information from poison-control centers has also been used to monitor
acute occupation-related health events such as exposure to agricultural chemicals and
corrosive chemicals (250). In some centers, requests for information on treatment for
suspected poisonings may be collected and computerized in a standard form, although a
standard format for a minimum data set has not been adopted. Exchange of information
by national and international organizations--such as the American and the European
Associations of Poison Control Centers and the World Federation of Poison Control
Centers--facilitates identification and treatment of persons for acute conditions
related to exposure to toxic substances (249).

Unlike  hospital-discharge  data,  information  from  emergency  rooms,  poison-control 
centers,  and  related  facilities  is  usually  not  available  routinely  in  a  standard 
format.   Efforts  are  under  way,  however,  to  create  standard  minimum  data  sets  and 
reporting formats to aggregate and compare data. With the increase in surgical and
other  procedures  performed  on  an  outpatient  basis,  the  importance  of  collecting  core 
information  from  outpatient  settings  will  increase. 

Ambulatory  Care  and  Related  Data 

With  the  exception  of  countries  such  as  Sweden  and  Canada  that  have  integrated  health- 
information  systems,  ambulatory-care  data  are  not  generally  available  from 
administrative  sources  for  all  segments  of  the  population.   Information  on  the 
prevalence  of  signs,  symptoms,  and  conditions  not  usually  requiring  hospitalization  is 
usually  obtained  through  periodic  surveys  of  the  general  population  or  through 
sentinel-surveillance  systems  characterized  by  voluntary  reporting  of  specific 
conditions by health-care providers. In the United States, a Uniform Ambulatory Care
Data Set (UACDS), first developed in 1974 and revised in 1990, offers the possibility
for standardization of ambulatory-care data (219), although it is not widely used at
present. Diagnostic information, however, is often not required, and when it is
included, actual diagnoses are often difficult to distinguish from presumptive
diagnoses that are being "ruled out." Inpatient procedures are usually coded using
the ICD-9-CM, but a universally accepted classification system is not used in
outpatient settings. The Current Procedural Terminology, fourth revision (CPT-4) and
the HCFA Common Procedure Coding System (HCPCS) are both used, although CPT-4 codes are
not equivalent to ICD-9-CM codes used for the same procedures in inpatient settings.
With  rapid  changes  in  medical  care,  it  is  difficult  to  maintain  an  up-to-date 
procedure-classification  system. 

In  spite  of  these  limitations,  the  use  of  claims  and  related  data  from  public  programs 
for  surveillance  and  program  planning  is  increasing  in  the  United  States.   While  data 
from public programs cover only a segment of the population, it is the segment to
which  public  health  interventions  are  most  often  targeted.  Information  from  the 
Medicaid  program,  in  particular,  has  been  used  by  state  and  local  health  departments. 
About  23  million  individuals  were  enrolled  in  Medicaid  programs  in  Fiscal  Year  1988, 
accounting  for  about  10%  of  personal  health-care  expenditures  in  the  United  States 
(123). The eligible population, however, changes substantially over time.
Because the states have broad discretion in administering the program under federal
guidelines,  benefits  vary  from  state  to  state,  as  do  the  health-information  systems 
used  to  track  health  claims.   The  states  report  aggregate  expenditure  and  utilization 
data  to  HCFA,  although  about  half  the  states  voluntarily  report  patient-level 
information (107,228). Data from five states that report data using uniform
enrollment, provider, and claims-file formats can be aggregated, but otherwise,
differences  in  eligibility,  covered  services,  and  file  structure  make  it  difficult  to 
aggregate  data  across  states.  Within  states,  however,  health  departments  are 
attempting  to  link  public  health  data  from  various  sources  to  monitor  the 
effectiveness  of  their  programs,  particularly  in  the  maternal  and  child  health  area 
(203). Many states now link birth- and death-certificate data for deaths that occur
within  the  first  year  of  life.   Some  states  are  able  to  link  Medicaid  data  with  vital- 
record  data,  and  a  few  are  also  able  to  add  data  from  various  public  health  programs 
to  linked  Medicaid/vital-record  data  sets. 
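
Linking birth and death certificates for infant deaths can be sketched as a simple deterministic match on shared fields. The field names and the exact-match rule are assumptions for illustration. The text does not say which identifiers or matching methods the states use, and operational systems often rely on probabilistic matching instead.

```python
def _key(record, fields):
    """Normalized comparison key built from the chosen fields."""
    return tuple(str(record[f]).strip().lower() for f in fields)


def link_infant_deaths(births, deaths,
                       fields=("child_name", "date_of_birth", "mother_name")):
    """Pair each death record with the one birth record sharing the same key.

    Returns (linked_pairs, unlinked_deaths). A death is left unlinked when it
    matches no birth record or more than one, so ambiguous cases can be
    reviewed by hand rather than guessed.
    """
    index = {}
    for b in births:
        index.setdefault(_key(b, fields), []).append(b)

    linked, unlinked = [], []
    for d in deaths:
        candidates = index.get(_key(d, fields), [])
        if len(candidates) == 1:
            linked.append((candidates[0], d))
        else:
            unlinked.append(d)
    return linked, unlinked


# Hypothetical records.
births = [
    {"child_name": "Ann Lee", "date_of_birth": "1990-02-01", "mother_name": "Mary Lee",
     "birth_weight_g": 1450},
    {"child_name": "Tom Ray", "date_of_birth": "1990-03-15", "mother_name": "Sue Ray",
     "birth_weight_g": 3300},
]
deaths = [
    {"child_name": " ann lee ", "date_of_birth": "1990-02-01", "mother_name": "Mary Lee",
     "cause": "prematurity"},
    {"child_name": "Joe Doe", "date_of_birth": "1990-05-05", "mother_name": "Jan Doe",
     "cause": "unknown"},
]
linked, unlinked = link_infant_deaths(births, deaths)
print(len(linked), len(unlinked))
```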

Public  health  program  data  are  derived  from  various  sources:  maternal-  and  infant-care 
clinics;  vaccination  clinics;  neonatal  screening  programs  for  inborn  errors  of 
metabolism,  maternal  drug  use,  and  HIV  seroprevalence;  lead-screening  programs  for 
schoolchildren;  clinics  for  children  with  special  needs;  families  enrolled  in  the 
Women,  Infants,  and  Children  (WIC)  nutrition  supplement  programs;  hospital  discharge 
data; data from the Pregnancy Risk Assessment Monitoring System (PRAMS); school
vaccination records; and data from Head Start programs (203,252). State and local
health  departments  have  met  with  varying  levels  of  success  in  linking  data  sets,  but 
the  most  successful  have  been  able  to  target  and  evaluate  public  health  interventions 
and  to  monitor  outcomes.   In  Tennessee,  for  instance,  adverse  sequelae  following 
vaccination  were  monitored  using  linked  vaccination-clinic  records,  Medicaid-claims 
data, and vital records (252). Also in Tennessee, birth certificate and WIC data were
linked to assess the extent to which high-risk infants were enrolled in county WIC
programs (253). Massachusetts and Colorado are among the states that are redesigning
data bases for public health programs so that the data can be linked more easily
(203,251).

Some  information  derived  from  state  and  local  public  health  programs  is  available 
nationally  in  the  United  States.  CDC's  Pediatric  Nutrition  Surveillance  System  and 
Pregnancy  Nutrition  Surveillance  System  have  been  operational  since  1973  and  1980, 
respectively (203,251). In both systems, key indicators of nutrition status are
monitored  continuously  in  participating  states  using  information  derived  from 
publicly-funded  health,  nutrition,  and  food-assistance  programs.   Information  is 
available  from  40  states  for  the  pediatric-nutrition  system  and  from  16  states  for  the 
pregnancy-nutrition  system.  These  data  sets  have  been  used  to  assess  the  prevalence 
of  malnutrition  in  children  <  2  years;  to  assess  the  prevalence  of  anemia  during 
pregnancy  among  low-income  women;  and  to  monitor  the  decline  in  the  prevalence  of 
anemia among low-income children in the United States (254-256).

Although  few  countries  have  integrated  health-information  systems  at  present,  they  may 
become  more  common  in  the  future.  Although  not  integrated  and  not  inclusive  of  most 
of  the  population,  data  from  the  patchwork  of  administrative  systems  available  at 
present  have  been  used  successfully  for  public  health  surveillance  and  program 
planning.   In  the  United  States,  computerized  hospital  discharge  data  are  relatively 
standardized,  but  access  is  limited  in  some  states.   Because  data-reporting  formats 
are  less  standardized  for  outpatient  settings,  it  is  difficult  to  aggregate  such  data. 
Efforts  by  state  health  departments  to  create  integrated  data  bases  for  public 
programs  will  help  states  to  monitor  their  programs  more  effectively.  Although 
eligibility  may  vary  among  states,  standardization  and  reporting  of  data  for  at  least 
some  core  variables  could  enhance  information  available  nationwide  on  problems  of 
public  health  importance. 

SUMMARY 

Sources  of  data  available  for  public  health  surveillance  vary  considerably  from 
country to country. Developed countries and many developing countries are able to
monitor reproductive outcomes and mortality through vital statistics systems, and many
countries  have  notifiable  disease-reporting  systems  for  at  least  some  infectious 
diseases.   Otherwise,  the  extent  of  information  available  through  administrative  data 
systems,  surveys,  registries,  and  sentinel  surveillance  systems  varies  extensively 
from  country  to  country.   Although  the  quality  and  the  completeness  of  these  data 
sources  may  be  limited,  they  often  provide  low-cost  information  that  is  useful  for 
public  health  surveillance  and  related  activities.   Even  if  new  data-collection 
efforts  are  needed  to  address  specific  problems,  routinely  collected  data  can  provide 
background  information  that  will  be  useful  for  designing  these  studies. 

The increasing computerization of health information, the availability of powerful but
relatively  inexpensive  computers,  and  the  development  of  user-friendly  software  should 
facilitate  the  timely  use  of  information  from  a  wide  range  of  sources.   Although 
integrated  health-information  systems  and  computerized  medical  records  may  be  on  the 
horizon  in  some  countries,  limited  information  that  is  available  quickly  from 
notifiable-disease  and  sentinel-surveillance  systems  is  often  the  most  useful  for 
conditions  in  which  timely  public  health  action  is  needed.  Since  no  one  source  of 
data  is  usually  adequate,  good  public  health  decision-making  invariably  requires  the 
synthesis  of  data  of  varying  quality  from  a  wide  range  of  sources  as  well  as  critical 
interpretation  of  findings. 

Appendix III.A. Surveillance or Health Information Systems
Mentioned in Chapter III

I.  Notifiable  diseases  and  related  reporting  mechanisms 

NNDSS     National Notifiable Diseases Surveillance System, United States
          (CDC and state health departments)

VAERS     Vaccine Adverse Event Reporting System, United States (FDA)

II.  Vital  Statistics 

          121-City Surveillance System, United States (CDC)

MSS       Mortality Surveillance System, United States (NCHS/CDC)

NTOF      National Traumatic Occupational Fatality surveillance system,
          United States (NIOSH/CDC)

          Medical Examiner/Coroner Information Sharing System, United
          States (NCEH/CDC)

FARS      Fatal Accident Reporting System, United States (NHTSA)

III.  Sentinel  surveillance 


SENSOR    Sentinel Event Notification System for Occupational Risks,
          United States (NIOSH/CDC)

EEARDN    European Electronic Adverse Drug Reaction Network, Europe


IV.  Registries 


          Connecticut Tumor Registry, United States

SEER      Surveillance, Epidemiology, and End Results Program, United
          States (NCI)

MACDP     Metropolitan Atlanta Congenital Defects Program, United States
          (NCEH/CDC)


V.  Surveys

GHS       General Household Survey, United Kingdom

          Continuous Household Survey, Ireland

NHIS      National Health Interview Survey, United States (NCHS/CDC)

BRFSS     Behavioral Risk Factor Surveillance System, United States
          (NCCDPHP/CDC and state health departments)

YRBS      Youth Risk Behavior Surveillance System, United States
          (NCCDPHP/CDC and state health departments)

NHDS      National Hospital Discharge Survey, United States (NCHS/CDC)

NAMCS     National Ambulatory Medical Care Survey, United States
          (NCHS/CDC)

NDTI      National Disease and Therapeutic Index, United States
          (private sources)

NSFG      National Survey of Family Growth, United States (NCHS/CDC)

NHANES    National Health and Nutrition Examination Survey, United
          States (NCHS/CDC)

          Hispanic Health and Nutrition Examination Survey, United
          States (NCHS/CDC)

          HANES I Follow-up Study, United States (NCHS/CDC)


VI.  Administrative  data-collection  systems 

PAS       Professional Activity Studies, United States (CPHA)

MDHIS     McDonnell Douglas Hospital Information System, United States

BDMP      Birth Defects Monitoring Program, United States (NCEH/CDC)

MEDPAR    Medicare Provider Analysis and Review, United States (HCFA)

HASS      Home Accident Surveillance System, United Kingdom

EHLASS    European Home and Leisure Accident Surveillance System, Europe

NEISS     National Electronic Injury Surveillance System, United States
          (CPSC)

DAWN      Drug Abuse Warning Network, United States (NIDA)

PRAMS     Pregnancy Risk Assessment Monitoring System, United States
          (NCCDPHP/CDC and state health departments)

Chapter IV


Management of the Surveillance System
and Quality Control of Data

"It is possible to fail in many ways...while to succeed is possible only in one way
(for which reason also one is easy and the other difficult--to miss the mark easy, to
hit it difficult)."

Aristotle


INTRODUCTION 

This chapter describes the practical management and quality control of a
disease-reporting system for notifiable diseases at the stage at which disease and
injury reports are gathered--as in a city/county health department, state health
department, or within the federal government. It is important to note that in most
health jurisdictions there are laws
that specify which diseases and injuries are reportable, who is responsible for
reporting, and what method and timing of reporting are to be used (e.g., by telephone
within 24 hours of diagnosis or by mail within 1 week of diagnosis) (1). Because
these  reporting  laws  differ  by  geographic  locale  and  municipal  unit,  the  material  in 
this  chapter  is  restricted  to  a  general  overview  of  a  disease-surveillance  system, 
recognizing  that  aspects  may  not  be  applicable  to  all  areas  and  that  issues  specific 
to  jurisdictions  are  not  covered  completely.  The  term  "state"  is  used  in  this 
discussion;  although  "state"  is  a  geographic  designation  in  the  United  States, 
analogous  geographic  units  have  similar  functions  in  other  countries. 

Types of Reports and Surveillance Systems

There are three categories of notifiable disease reports: a) conditions for which
information is collected on each individual with the disease or injury; b) conditions
for  which  only  the  total  number  of  patients  seen  is  reported;  and  c)  conditions  for 
which  the  total  number  of  cases  is  reported  if,  and  only  if,  there  is  judged  to  be  an 
epidemic.   Each  category  generally  requires  specific  forms.   Once  a  report  has  been 
received,  for  many  conditions  a  nurse  or  other  disease  investigator  may  request  that 
the  reporting  unit  provide  information  for  additional  disease/injury-investigation 
forms. 

A traditional way of classifying a surveillance system is as passive or active (2). A
passive surveillance system can be described as one in which the health jurisdiction
receives disease/injury reports from physicians or other individuals or institutions
as  mandated  by  state  law.   In  contrast,  an  active  surveillance  system  is  established 
when  the  health  department  regularly  contacts  reporting  sources  (e.g.,  once  per  week) 
to  elicit  reports,  including  negative  reports  (no  cases).  An  active  surveillance 
system  is  likely  to  provide  more  complete  reporting  but  is  much  more  labor  intensive 
and  is  therefore  more  costly  to  operate  than  a  passive  system. 
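
In an active system, the health department needs to know which reporting sources have not yet been heard from, since a negative report (no cases) is itself a report. A minimal sketch of that bookkeeping follows. The source names and the weekly cycle are hypothetical.

```python
def sources_to_contact(all_sources, reports_this_week):
    """List sources with no report on file for the current week.

    all_sources: iterable of reporting-source names.
    reports_this_week: dict of source name -> number of cases reported,
                       where 0 is a valid negative report.
    """
    return sorted(s for s in all_sources if s not in reports_this_week)


# "Lab C" filed a negative report, so only "Clinic B" needs a call.
sources = ["Clinic A", "Clinic B", "Lab C"]
received = {"Clinic A": 2, "Lab C": 0}
print(sources_to_contact(sources, received))
```

Distinguishing "reported zero cases" from "did not report" is what makes active reporting more complete, and the follow-up contact it requires is what makes it more labor intensive.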

In  most  surveillance  systems,  any  health  worker  who  has  knowledge  of  an  individual 
with  a  reportable  condition  may  be  required  to  report  that  case  to  the  health 
department.   In  a  sentinel  surveillance  system,  only  selected  physicians  or 
institutions  report  disease  or  injury.   Proponents  of  sentinel  systems  maintain  that 
it  is  preferable  to  receive  disease/injury  reports  of  high  quality  from  a  few  sources 
than  to  receive  data  of  unknown  quality  from  (in  theory)  all  potential  reporting 
sources  in  a  population.  This,  of  course,  presupposes  that  the  reporters  in  a 
sentinel  system  will,  in  fact,  provide  high-quality  information  on  a  reliable  basis. 
It  should  also  be  noted  that  sentinel  systems  are  inadequate  when  every  case  of  a 
particular  condition  needs  to  be  identified. 

Most  states  have  comprehensive,  passive  disease  surveillance  systems.  For  example, 
"as  required  by  law  in  all  50  U.S.  states,"  any  health  worker  having  knowledge  of  a 
person  with  a  reportable  condition  is  obligated  to  report  that  case  to  the  local/state 
health department (1). Regular contact initiated by the health department and
directed to all possible reporting sources is neither feasible nor required.

Collection  of  Data 

Laws  for  reporting  disease  and  injury  at  the  state  and  local  levels  not  only  specify 
who  is  responsible  for  reporting,  but  to  whom  the  reports  are  to  be  directed.   In  the 
least  complicated  reporting  situation,  a  physician  diagnoses  a  reportable  condition 
and  sends  the  appropriate  report  form  to  the  local  health  department,  where  the  data 
on  that  case  are  added  to  the  appropriate  disease/injury-surveillance  system. 
Summaries  of  reports  are  reviewed  regularly  and  analyzed  by  staff  at  the  local  health 
department  to  identify  any  conditions  that  are  being  reported  more  frequently  than 
expected  on  the  basis  of  past  experience.  After  disease/injury  reports  have  been 
processed  at  the  local  level,  the  information  is  forwarded  to  the  state  health 
department  to  be  consolidated  with  reports  from  other  local  health  departments,  and 
the  composite  data  are  examined  for  trends.   Each  state  health  department  then 
voluntarily  reports  these  cases  to  the  Centers  for  Disease  Control  (CDC)  on  a  weekly 
basis  (3) . 

This  reporting  scheme  can  be  reasonably  effective,  but  problems  can  arise.   For 
example,  how  does  one  notify  health-care  professionals  about  the  requirements  and 
procedures  for  reporting  to  the  health  department?  Who  is  responsible  for  such 
notification?  How  are  new  practitioners  in  the  jurisdiction  identified  and  notified 
of their responsibility to report? Who provides quality assurance for the process?
How?  At  what  frequency?  Other  issues  include  reporting  of  suspected  cases  while 
laboratory  results  are  pending,  the  desired  routing  of  reports,  the  mechanism  for 
updating/completing  reports  as  additional  information  is  received,  reporting  of 
disease/injury  among  transients  (e.g.,  military  personnel  or  migrant  workers),  and 
defining  appropriate  time  frames  for  reporting  a  case  of  a  specific  disease/injury 
(Table IV.1).

There may not be one correct answer to each of the questions formulated in Table IV.1
that  applies  in  all  situations;  the  answers  are  often  situation  dependent.  However,  a 
disease-  or  injury-surveillance  system  should  document  how  to  respond  to  each  of  the 
above  questions  so  that  disease  reporting  is  performed  in  a  consistent  manner  for  each 
disease.

Entry of Data into the Surveillance System

With  the  availability  of  microcomputers,  many  health  departments  enter  disease/injury 
reports into computerized data bases. It is essential that one person be responsible
for management of the surveillance data base, i.e., be designated and act as the
data-base manager (DBM) (4). A primary responsibility of the DBM is maintaining the
integrity and completeness of the data base. Concerns of the DBM are summarized in
Table IV.2.

Checklist  for  Data-Base  Manager 

With  any  surveillance  system  for  disease/injury,  there  is  a  need  to  establish 
procedures for maintenance and retention of paper disease-report forms (called "source
documents").   In  general,  the  individual  disease  reports  are  filed  by  year  of  report 
(or  onset),  by  disease,  and  in  alphabetical  order  by  the  patient's  last  name.   If  not 
already  specified  by  disease-reporting  laws,  retention  periods  should  be  designated 
for  maintaining  these  files  for  reference  purposes.   Electronic  reporting  may  obviate 
the  need  for  redundant  paper  records.   (See  Chapter  XI  for  more  information  on 
computerized  surveillance  systems.) 

Documentation  and  Training 

Documentation  is  a  critical  step  in  the  development  of  a  computerized  system--but  one 
that is often neglected. A users' manual is needed and should provide both general
and  detailed  descriptions  of  the  system,  including  the  following  topics  (4): 

•  General  description  of  the  entire  system 

•  Detailed  procedures  for  installing  the  system 

•  Detailed  procedures  for  operating  the  system 

•  Detailed  procedures  for  maintaining  the  system 

The  DBM  should  maintain  contact  with  the  programmer  for  the  system  so  that 
modifications  to  record  formats  and  programs  can  be  documented  by  the  manager;  the 
programmer  should  also  maintain  a  file  of  all  such  changes.  Thorough,  clear 
documentation  facilitates  the  addition  of  new  programs  and  modifications  in  equipment 
or  operations  (4) . 



A  formal  training  program  should  be  established  for  persons  involved  in  the  daily 
operation  of  the  surveillance  system.   These  staff  members  must  feel  that  they  can 
participate  in  shaping  the  system,  and  their  ideas  and  comments  should  be  elicited  as 
part of the training process (4). The DBM should schedule a series of training
classes that include hands-on experience with the data-base software. Written
operational procedures (including guidelines for interpreting information contained in
the disease/injury report forms) should be distributed and explained at this time.
Software  tutorial  packages  and  videotapes  (interactive  or  presentational)  can  also  be 
useful  tools  for  training. 

Management  of  the  organization  responsible  for  the  surveillance  system  should  also  be 
oriented  to  the  system  in  one  or  several  briefing  sessions. 

Analysis  and  Standard  Reports 

An  effective  surveillance  system  must  be  designed  to  cover  all  the  following  areas  in 
its  reporting  process: 

•  Determining  whether  a  condition  is  being  reported  more  frequently  than 
expected  (see  Chapter  V) 

•  Responding  appropriately  to  reports  of  individual  cases 

•  Detecting  clusters  of  cases 

•  Notifying  public  health  practitioners  of  the  presence  of  specific 
conditions  in  their  areas 

•  Reinforcing  the  importance  of  reporting  through  facilitating  effective 
control/prevention activities

The completeness and timeliness of case reports in the surveillance system should be
assessed regularly. This assessment should include both the proportion of reports in
which each variable (such as age of patient or date of onset of the condition) is
completed and the time between onset of condition and receipt of report. At the local
health  department,  this  information  can  be  analyzed  by  reporting  source  (e.g., 
clinicians  or  hospital  or  diagnostic  laboratory  staff)  or,  at  the  state  level,  by 
health  jurisdiction.   These  analyses  should  identify  groups  or  institutions  in  need  of 
additional  information  or  training  on  disease  reporting. 
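
The assessment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (source, age, onset, received) and the sample records are hypothetical and do not come from any particular surveillance system.

```python
from datetime import date
from statistics import median

# Hypothetical case reports; field names and values are illustrative only.
reports = [
    {"source": "Clinic A", "age": 34, "onset": date(1992, 3, 1), "received": date(1992, 3, 8)},
    {"source": "Clinic A", "age": None, "onset": date(1992, 3, 4), "received": date(1992, 3, 6)},
    {"source": "Lab B", "age": 51, "onset": None, "received": date(1992, 3, 20)},
    {"source": "Lab B", "age": 7, "onset": date(1992, 3, 10), "received": date(1992, 3, 24)},
]


def completeness(records, field):
    """Proportion of records in which `field` has been filled in."""
    if not records:
        return 0.0
    return sum(r[field] is not None for r in records) / len(records)


def median_delay_days(records):
    """Median days from onset to receipt, using only records with a known onset."""
    delays = [(r["received"] - r["onset"]).days for r in records if r["onset"] is not None]
    return median(delays) if delays else None


def summarize_by_source(records):
    """Completeness and timeliness for each reporting source."""
    by_source = {}
    for r in records:
        by_source.setdefault(r["source"], []).append(r)
    return {
        source: {
            "age_complete": completeness(rs, "age"),
            "onset_complete": completeness(rs, "onset"),
            "median_delay_days": median_delay_days(rs),
        }
        for source, rs in by_source.items()
    }


summary = summarize_by_source(reports)
```

A source with low completeness or a long median delay would be a candidate for the additional information or training mentioned above. At the state level the same grouping could be done by health jurisdiction rather than by reporting source.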

Most  surveillance  systems  for  infectious  disease  rely  primarily  on  receipt  of  case 
reports  from  physicians  and  other  health-care  providers.   To  encourage  reporting  by 
these  health  professionals,  many  local  health  departments  and  most  state  health 
departments  publish  newsletters  containing  data  and  other  information  of  interest  to 
the contributors to the data base (1). Such newsletters may include standard tabular
reports  of  the  occurrence  of  a  reportable  condition  by  week  or  month,  with  a  year-to- 
date  summary.   They  may  also  include  narrative  reports  about  conditions  of  interest  or 
about  other  topics  relevant  to  public  health.   Such  feedback  is  important  to 
demonstrate  to  those  involved  with  the  system  that  the  data  are  being  used,  as  well  as 
to accomplish communications goals (see Chapter VII).

The  information  needs  of  management  and  operations  personnel  should  be  considered  as 
programs  are  developed  for  standard  reports  from  the  data  base.  Standard  reports 
should  include  information  on  time,  place,  and  person,  and  should  be  produced  in  a 
form  that  can  be  easily  interpreted  by  epidemiologists  and  management.   The  purpose  of 
each  report  should  dictate  the  appearance  of  the  output,  e.g.,  a  table,  map,  or  graph. 
Most  types  of  reports  should  be  produced  on  a  regular  basis  and  according  to  a  set 
schedule,  but  others  may  be  created  only  on  an  as-needed  basis. 

Data  Sharing 

In  some  situations  disease  and  injury  reports  may  be  shared  by  various  local  or  state 
health  departments,  particularly  with  conditions  that  require  additional  investigation 
or  follow-up.   For  example,  when  a  resident  of  one  county/state  is  examined  and  given 
a  particular  diagnosis  at  a  hospital  in  a  neighboring  county/state,  health  authorities 
need  to  be  able  to  track  the  condition  back  to  its  source  in  order  to  respond 
appropriately.

Occasionally, disease and injury reports are sent directly to the state health
department, bypassing the local health department. If that happens, the state needs
to  notify  the  appropriate  local  health  department  so  that  the  reports  can  be  added  to 
the  disease/injury  reporting  system  at  the  local  level.  Additional  data  that  the 
state  may  collect  should  also  be  shared  with  the  local  health  department. 

The  DBM  should  be  aware  of  other  sources  of  information  that  may  need  to  be  accessed 
and  compared  with  or  added  to  the  data  collected  in  his  or  her  own  system — e.g., 
laboratory  results,  epidemiologic  information  for  specific  conditions,  population 
estimates,  and  mortality  records.   Through  careful  planning  and  coordination  on  the 
part  of  managers  of  reporting  systems,  standard  coding  schemes  can  be  adopted  as  data 
systems  evolve.  These  actions  facilitate  the  sharing  and  use  of  data. 

System  Maintenance  and  Security 

Maintenance  of  a  system  should  be  directed  first  toward  reducing  errors  introduced 
through  flaws  in  design  and  through  content  changes  (e.g.,  changes  in  the  list  of 
notifiable  conditions)  and  second  toward  improving  the  system's  scope  and  services. 
Related  activities  can  be  categorized  as  routine  maintenance,  emergency  maintenance, 
requests  for  special  reports,  and  system  improvements.  Maintenance  should  not  be 
performed  on  an  informal  or  first-come,  first-served  basis.  An  effective  maintenance 
program  includes  the  following  steps  (4): 

•  Back  up  data  and  system  files  according  to  an  established  schedule,  and 
maintain  records  in  a  secure  environment. 

•  Require  that  requests  for  emergency  maintenance  be  made  in  writing  and 
entered  into  a  log. 

•  Assign  priorities  to  special  requests  on  the  basis  of  urgency  of  need  and 
time  and  resources  required. 

•  Institutionalize  routine  maintenance,  such  as  procedures  associated  with 
changing  to  a  new  reporting  year. 

•  Document maintenance as it is conducted.

In order to maintain the integrity of a computer system, only one person should have
the  authority  to  access  the  system  and  assign  and  change  passwords.  The  DBM  should  be 
the  only  staff  member  with  authority  to  install  or  modify  production  software.  This 
same  rule  should  apply  to  access  to  the  physical  computer  files.   Authority  to  add  or 
delete  files  from  subdirectories  or  environments  of  computers  should  be  delegated  to 
only  one  individual  who  is  then  held  accountable  for  all  modifications.   A  second 
computer  should  be  available  for  testing  changes  to  the  system  so  that  the  computer 
used  for  the  surveillance  system  can  be  reserved  for  production  only.  The  second 
computer  could  also  serve  as  a  back-up  computer  should  the  primary  machine  fail. 

The  numerous  risks  to  the  security  of  a  data  base  include  mechanical  failure,  human 
carelessness,  malicious  damage,  crime,  and  invasion  of  privacy.  Therefore,  back-up 
copies  of  the  data  base  should  be  kept  off-site  to  ensure  that  the  system  cannot  be 
deliberately  or  unintentionally  destroyed.  Updating  of  the  off-site  copies  should  be 
done  on  a  routine  basis,  and  new  diskettes  should  be  used  to  make  back-up  copies  at 
least  once  each  year. 

A  monthly,  total  system  back-up  is  recommended,  if  a  valid  copy  of  the  current  system 
is  available.   Data  files  that  are  changed  during  the  day  should  be  backed  up  at  the 
end  of  the  day. 

Computer  viruses  have  become  a  threat  to  data-base  and  computer-system  security. 
These  programs  can  be  highly  sophisticated  and  are  capable  of  attaching  themselves  to 
software  or  data  being  loaded  on  the  computer  or  data  being  sent  from  one  computer  to 
another.   Software  is  available  to  scan  entire  systems  or  diskettes  for  virus 
infections;  such  software  should  be  updated  periodically  because  of  the  addition  of 
new  viruses.   Data  received  via  telecommunications  channels  or  on  diskettes  from  other 
sources  should  always  be  scanned  before  data  files  and  programs  are  copied  to  the 
computer's  disk.   Software  retrieved  from  electronic  bulletin  boards  should  be 
carefully  examined  before  being  incorporated  into  a  system. 

In  the  event  of  extended  mechanical  failure,  a  contingency  plan  should  be  in  place  for 
shifting  the  base  of  operations  to  another  computer. 

Surveillance data on disease/injury are generally received by a local health
department, forwarded through a regional health center, and eventually directed to the
state  health  department.   The  complete  reporting  form,  which  includes  confidential 
information  on  patients,  is  usually  shared  by  local  and  state  health  departments  for 
purposes  of  follow-up  (if  necessary)  and  for  identifying  and  deleting  any  redundant 
(duplicate)  reports. 

Persons  who  report  disease/injury  should  be  familiar  with  the  types  of  activities  that 
may  follow  the  receipt  of  a  report.  For  example,  for  purposes  of  prevention  or 
treatment,  all  cases  of  syphilis  may  be  investigated  to  determine  the  source  of  the 
infection and potential spread of the infection to others. Disease-reporting laws
may  specify  who  has  access  to  the  confidential  portions  of  a  disease/injury  report, 
and  it  is  important  to  assure  that  the  confidentiality  of  the  report  is  maintained. 
Failure  to  keep  the  reports  confidential  is  likely  to  lead  to  an  unwillingness  to 
report  on  the  part  of  physicians  and  other  health-care  providers.   Reports  and  files 
that  do  not  require  personal  identifiers  should  not  contain  them.   In  the  United 
States,  notifiable-disease  reports  received  from  states  by  CDC  do  not  include  personal 
identifiers (such as name, address, and telephone number).
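
As a rough illustration of the two data-handling steps mentioned in this section (deleting duplicate reports and removing personal identifiers before data are forwarded), consider the following Python sketch. The field names and the matching rule (same name, condition, and onset date) are assumptions made for the example; an actual system would define its own matching criteria and its own list of identifying fields.

```python
# Fields treated as personal identifiers in this sketch (hypothetical list).
IDENTIFIER_FIELDS = ("name", "address", "telephone")


def remove_duplicates(reports):
    """Keep the first report for each (name, condition, onset) combination."""
    seen = set()
    unique = []
    for r in reports:
        # Normalize the name so trivial differences do not hide a duplicate.
        key = (r["name"].strip().lower(), r["condition"], r["onset"])
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


def strip_identifiers(report):
    """Return a copy of the report without personal identifiers."""
    return {k: v for k, v in report.items() if k not in IDENTIFIER_FIELDS}


# Hypothetical reports; the second is a duplicate of the first.
local_reports = [
    {"name": "Jane Doe", "address": "1 Main St", "telephone": "555-0100",
     "condition": "measles", "onset": "1992-03-01", "county": "A"},
    {"name": " jane doe", "address": "1 Main Street", "telephone": "555-0100",
     "condition": "measles", "onset": "1992-03-01", "county": "A"},
    {"name": "John Roe", "address": "2 Oak Ave", "telephone": "555-0101",
     "condition": "measles", "onset": "1992-03-05", "county": "B"},
]

deduplicated = remove_duplicates(local_reports)
forwarded = [strip_identifiers(r) for r in deduplicated]
```

Duplicates are removed first because the matching rule relies on the identifiers; only the de-identified copies in forwarded would leave the health department.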

Modification  of  Reporting  Systems 

The  basic  steps  shown  below  are  intended  to  ensure  that  a  computer-based  surveillance 
system  will  meet  current  and  future  needs.  A  systems  analyst,  an  epidemiologist,  and 
the  final  users  of  information  from  the  system  should  work  together  to  produce  a 
system that is user-friendly and functional (5).

1.  Review  current  methods  of  processing  disease/injury  information.   Obtain  copies 
of paper forms or computer-screen forms or reports. Determine whether suggested
report  forms  or  screens  are  available  from  state  or  national  agencies.  Often, 
ready-to-use  surveillance  software  is  available.  Use  of  such  systems 
facilitates  standardization,  quality  control,  and  comparability  of  data. 

2.  Review  with  management  and  users  any  problems  with  the  current  method  for 
processing  data  and  any  desired  future  enhancements. 

3.  Document the current system and proposed future system. Allow concerned parties
to review and comment on their understanding of objectives for the system.

4.  Limit  access  to  the  confidential  portion  of  a  disease/injury  report  as  much  as 
possible.  Store  the  original  report  forms  containing  confidential  data  in 
locked  cabinets  or  a  locked  room.   Secure  electronic  data  bases  by  limiting 
access  to  the  computer,  and  obtain  additional  security  through  the  required  use 
of  passwords  (pre-approved  for  access  to  the  protected  portion  of  the  data 
base).

5.  Document  developmental  specifications  to  meet  the  objectives  above.   In 
addition,  document  proposed  testing  schedules  and  methodology  for  implementing 
the  system  when  it  is  completed. 

6.  Develop  prototypic  screens  and  reports  for  management  and  end  users  to  review, 
so  that  misunderstandings  and  problems  can  be  identified  and  resolved  during 
development.

7.  Once  all  parties  are  in  agreement,  establish  self-contained  modules  of 
development  that  can  be  completed,  and  proceed  to  the  testing  stage  while  other 
modules  are  being  developed. 

8.  Begin  development  in  a  test  environment  separate  from  any  current  computer-based 
production  system.   Document  any  changes  to  developmental  specifications  that 
become  necessary  during  actual  development. 

9.  Produce  processing  manuals  for  users  (to  include  not  only  the  operation  of  the 
computerized  system  but  also  proper  handling  of  paper  forms,  storage  of 
electronic and paper data, and distribution of final reports). This
documentation  should  be  as  thoroughly  tested  as  the  actual  computer  system. 

10.  Establish  training  sessions  or  develop  tutorial  manuals  for  users.  If  such 
manuals  are  to  be  effective,  a  development/test  system  for  users  must  be  in 
place  during  their  training  stage. 

11.  Finalize specification documents to include all current stages of the system, as
well as all expected future enhancements. This documentation should include a
schedule  and  methodology  for  maintaining  and  troubleshooting  the  system. 

12.  Establish and document proper back-up and data-recovery techniques. This step
includes  selecting  a  data-base  manager. 

SUMMARY 

A  surveillance  system  of  high  quality  and  integrity  can  only  be  developed  through 
careful  planning,  documentation,  implementation,  training,  and  long-term  support. 
Because  of  the  changing  nature  of  disease/injury  reporting  (e.g.,  new  conditions  being 
added or case definitions being modified), useful surveillance systems must be
flexible  enough  to  allow  for  such  changes  with  a  minimal  amount  of  disruption. 

Also important is the coordination of disease- and injury-reporting activities among
local  health  departments,  from  local  health  departments  to  their  appropriate  state 
health  departments,  and  among  state  health  departments.   The  Council  of  State  and 
Territorial  Epidemiologists  has  played  an  important  role  in  the  state-to-state 
coordination  of  disease  and  injury  reporting,  as  well  as  in  reporting  practices  from 
states  to  CDC. 

While  there  are  many  complicated  aspects  of  disease/injury-surveillance  systems,  it  is 
important  to  remember  that  the  overall  purposes  of  such  systems  are  to  provide 
information  on  preventing  disease  and  injury  and  to  improve  the  quality  of  the  public 
health.

INTRODUCTION

Historically, the core processes of public health surveillance have involved using
appropriate methods to aggregate the units of data being collected--namely, analysis--
and also creative approaches to assess the emerging data patterns--namely,
interpretation (1).

For  these  reasons,  the  ability  to  analyze  and  interpret  surveillance  data  determines 
the  mettle  of  the  epidemiologist.   Viewed  as  basic  to  observational  studies  (2), 
surveillance  is  at  the  forefront  of  the  spectrum  of  descriptive  epidemiology. 
Surveillance  has  a  myriad  of  uses  (3,4),    each  of  which  requires  careful  analysis  and 
interpretation.   Whether  surveillance  is  used  to  detect  epidemics,  suggest  hypotheses, 
characterize  trends  in  disease  or  injury,  evaluate  prevention  programs,  or  project 
future  public  health  needs,  data  from  a  surveillance  system  must  be  analyzed  carefully 
and  interpreted  prudently.   In  this  chapter  we  address  practical  and  methodologic 
approaches  to  surveillance  analysis;  the  presentation  of  surveillance  data  by  time, 
place,  and  person;  the  concept  of  rates  and  standardization  of  rates;  approaches  to 
exploratory  data  analysis;  the  use  of  graphics  and  maps;  and,  finally,  the  systematic 
interpretation  of  surveillance  data. 

APPROACH  TO  SURVEILLANCE  ANALYSIS 

Practical  Approach 

The  fundamental  approach  to  analyzing  surveillance  data  is  relatively  straightforward. 
Because  of  their  descriptive  nature,  surveillance  data  cannot  be  used  for  formal 
hypothesis  testing  (5).   Rather,  the  regular  scrutiny  of  systematically  collected 
information  allows  epidemiologists  to  describe  patterns  of  disease  and  injury  in  human 
populations,  organized  by  a  variety  of  sub-measures.  Moreover,  the  analysis  (and 
subsequent  interpretation)  proceeds  from  the  specific  elements  of  the  data  themselves. 
Thus,  surveillance  analysis  represents  an  inductive  reasoning  process  in  which  the 
assembly  of  individual  units  eventually  produces  a  more  general  picture  of  health- 
related  problems  in  a  population. 

Frequently, the time-consuming problems of collecting, managing, and storing
surveillance data leave little energy for the analysis itself. Nonetheless, analyzing
surveillance  data  must  be  afforded  a  high  priority  by  those  in  charge  of  surveillance 
systems  (3).  Approaches  to  analyzing  surveillance  data  include  the  following  steps: 

1.  Know the inherent idiosyncrasies of the surveillance data set. It is
tempting  to  begin  immediately  to  examine  trends  over  time.  However, 
intimate  knowledge  of  the  day-to-day  strengths  and  weaknesses  of  the 
data-collection  methods  and  the  reporting  process  can  provide  a  "real 
world"  sense  of  the  trends  that  emerge. 

2.  Proceed  from  the  simplest  to  the  most  complex.  Examine  each  condition 
separately,  both  by  numbers  and  crude  trends.  How  many  cases  were 
reported  each  year?  How  many  cases  were  reported  in  each  age  group  each 
year?  What  are  the  variable-specific  rates?  Only  after  looking 
separately  at  each  variable  should  one  examine  the  relationships  among 
these  variables. 

3.  Realize  when  inaccuracies  in  the  data  preclude  more  sophisticated 
analyses.   Erratically  collected  or  incomplete  data  cannot  be  corrected 
by complex analytic techniques. Differential reporting by different regions
or by different health facilities (see representativeness, Chapter VIII)
renders the resulting surveillance data set liable to misinterpretation.

Methodologic Considerations

Analysis  of  surveillance  information  depends  on  the  accuracy  of  that  information 
(Chapter  VIII).  Attempts  to  analyze  data  that  are  haphazardly  collected  or  have 
varying  case  definitions  waste  valuable  time  and  resources.  The  two  key  concepts 
that determine the accuracy of surveillance data are reliability and validity (5).
Reliability  refers  to  whether  a  particular  condition  is  reported  consistently  by 
different  observers,  whereas  validity  refers  to  whether  the  condition  as  reported 
reflects  the  true  condition  as  it  occurs.   Ideally,  both  reliability  and  validity  can 
be  achieved,  but  in  practice,  reliability  (e.g.,  reproducibility)  is  easier  than 
validity to assess. In situations in which biologic measures complement clinical case
definitions, such as laboratory testing for infectious diseases, the accuracy of the
data can be more completely assured. However, in the context of
more  subjective  behavioral  aspects,  such  as  those  associated  with  lifestyles,  accuracy 
is  more  difficult  to  confirm. 

The  application  of  standard  statistical  techniques  to  the  analysis  of  surveillance 
data  is  dictated  by  the  limitations  of  the  data  themselves  and  the  flexibility  of  the 
epidemiologist/statistician (5). In a sense, because the essentials of sampling
theory  have  not  been  satisfied,  no  statistical  testing  is  possible  with  the  often 
incomplete  surveillance  data  set.  However,  if  the  information  is  viewed  as  samples 
over  time,  apparent  clusters  of  health  events  can  be  evaluated  for  their  statistical 
"significance." Applying 95% confidence limits or other standard statistical tests to
these "samples over time" can allow a determination of whether any differences are
unlikely  to  have  occurred  by  chance  alone. 
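
One simple way to apply this idea is sketched below in Python. It is not a method prescribed by this chapter: it assumes that counts from comparable past periods can be treated as "samples over time," and it uses the mean plus 1.96 standard deviations of those counts as an approximate upper 95% limit. The function name and the counts are hypothetical.

```python
from statistics import mean, stdev


def exceeds_historical_limit(current, historical, z=1.96):
    """Compare a current count with an approximate upper limit from past counts.

    `historical` holds counts from comparable past periods (at least two
    values are needed to compute a standard deviation). Returns a tuple
    (flag, upper_limit), where flag is True if `current` is above the limit.
    """
    upper_limit = mean(historical) + z * stdev(historical)
    return current > upper_limit, upper_limit


# Hypothetical counts for the same 4-week period in five previous years.
past_counts = [10, 12, 8, 11, 9]

flagged, limit = exceeds_historical_limit(18, past_counts)
not_flagged, _ = exceeds_historical_limit(12, past_counts)
```

A flagged count only indicates that the difference is unlikely to be due to chance under these assumptions; changes in reporting practices or case definitions would still have to be ruled out before the increase is interpreted as real.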

Surveillance  analyses  are  often  ecologic,  since  they  describe  trends  in  groups  of 
individuals.  Thus,  the  use  of  surveillance  data  may  be  especially  prone  to  the 
problem  of  the  "ecological  fallacy"  (6,7).   In  brief,  this  type  of  bias  may  occur  when 
health  officials  interpreting  observations  about  groups  (e.g.,  aggregated  surveillance 
data) make causal inferences about individual phenomena (8). These population-level
analyses may suffer from two separate problems (7): a) aggregation bias, due to
loss of information when individuals are grouped, and b) specification bias, due to
the definition of the "group" itself (8). The chances of the ecological fallacy can
be  reduced  by  analyzing  subsets  of  surveillance  data  to  reveal  trends  in  the 
individual  characteristics.   However,  when  describing  bodies  of  surveillance  data, 
public health officials usually synthesize the population trends, thus opening the
possibility  for  fallacious  interpretation. 

Time,  Place,  and  Person 

Surveillance  data  allow  public  health  officials  to  describe  health  problems  in  terms 
of  the  basic  epidemiologic  parameters  of  time,  place,  and  person.   In  addition, 
surveillance  data  permit  comparisons  among  these  different  parameters  (e.g.,  what  are 
the  patterns  of  disease/injury  at  one  time  compared  with  another,  in  one  place 
compared with another, or among one population compared with another). Use of
appropriate census data as denominators allows calculation of rates, which then
facilitates comparison of the risks of disease or injury in terms of the parameters of
time, place, and person. Moreover, use of these fundamental variables permits
epidemics to be detected, long-term trends to be monitored, seasonal patterns to be
assessed  and  future  occurrence  of  disease/injury  to  be  projected,  thus  possibly 
facilitating  a  more  timely  public  health  response. 
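
The role of the denominator can be shown with a short Python sketch. The case counts and census populations below are invented for illustration, and the choice of 100,000 as the multiplier is a common convention rather than a requirement.

```python
def rate_per_100k(cases, population):
    """Crude rate per 100,000 population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases * 100_000 / population


# Hypothetical annual case counts and census populations for two counties.
counties = {
    "County A": {"cases": 30, "population": 150_000},
    "County B": {"cases": 30, "population": 600_000},
}

rates = {
    name: rate_per_100k(c["cases"], c["population"])
    for name, c in counties.items()
}
```

Both counties reported the same number of cases, but the rate in County A is four times that in County B, a difference that a comparison of counts alone would not reveal.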

Time 

Analysis of surveillance data by time can reveal trends in disease/injury. For all
health  conditions,  a  measurable  delay  occurs  between  the  exposure  and  the  problem.   In 
the  case  of  disease,  an  interval  exists  between  exposure  and  expression  of  symptoms, 
as well as intervals between a) onset of symptoms and diagnosis of the problem and
b) diagnosis and eventual reporting of the illness to public health authorities so that
it can be included in the surveillance data set. For an infectious disease, this last interval
may  represent  days  or  weeks,  whereas  for  chronic  disease  it  may  be  measured  in  years. 
Thus,  choosing  the  appropriate  interval  for  analysis  must  involve  a  consideration  of 
the  health  condition  being  assessed. 

Analysis  of  surveillance  data  by  time  can  be  conducted  in  several  different  ways  to 
detect changes in incidence of disease/injury. The easiest analysis is usually a
comparison of the number of case reports received during a particular interval (e.g.,
weeks or months) (see Figure 1.1). Such data can be organized into a table or graph
to  assess  whether  an  abrupt  increase  has  occurred,  whether  the  trends  are  stable,  or 
whether  a  gradual  rise  or  fall  in  the  numbers  occurs.   Another  simple  method  of 
analysis  compares  the  number  of  cases  for  a  current  time  period  (e.g.,  a  given  month) 
with the number reported during the same interval for the past several years.
Similarly,  the  cumulative  number  of  cases  reported  in  the  period  representing  the 
year-to-date  can  be  compared  with  the  appropriate  cumulative  number  for  previous 
years. 
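
The two comparisons just described (the current month against the same month in previous years, and the cumulative year-to-date total against the corresponding totals for previous years) might be sketched as follows. The data structure and the counts are hypothetical.

```python
# monthly_counts[year] is a list of 12 monthly case counts (hypothetical data).
monthly_counts = {
    1989: [4, 3, 5, 6, 2, 1, 0, 2, 3, 4, 5, 6],
    1990: [5, 2, 4, 7, 3, 1, 1, 2, 2, 5, 4, 7],
    1991: [3, 4, 6, 5, 2, 0, 1, 3, 3, 4, 6, 5],
    1992: [4, 3, 15, 0, 0, 0, 0, 0, 0, 0, 0, 0],  # only January-March reported so far
}


def same_month_history(counts, year, month, years_back=3):
    """Counts for the same month (1-12) in each of the preceding years."""
    return [
        counts[y][month - 1]
        for y in range(year - years_back, year)
        if y in counts
    ]


def year_to_date(counts, year, month):
    """Cumulative count from January through `month` of `year`."""
    return sum(counts[year][:month])


march_1992 = monthly_counts[1992][2]
march_history = same_month_history(monthly_counts, 1992, 3)
ytd_by_year = {y: year_to_date(monthly_counts, y, 3) for y in sorted(monthly_counts)}
```

In this invented example the March 1992 count (15) stands well above the counts for March in the three preceding years, and the 1992 year-to-date total is correspondingly higher than the totals for the same period in earlier years.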

Analyzing  long-term  (secular)  trends  is  facilitated  by  graphing  surveillance  data  over 
time.  The  watershed  events  that  influence  secular  trends--such  as  changes  in  the  case 
definition  used  for  surveillance,  new  diagnostic  criteria,  changes  in  reporting 
requirements  or  practices,  publicity  about  a  particular  condition,  or  new  intervention 
programs--can be indicated on the graph. Changes in the surveillance system itself
also influence long-term trends, particularly when the intensity of active case
detection increases (e.g., screening programs in particular communities).

Finally,  additional  epidemiologic  measures  enhance  the  analysis  of  surveillance  data 
by  time.  Using  denominators  to  calculate  rates  becomes  especially  important  if 
changes  occur  in  the  community,  such  as  the  immigration  of  a  new  population.  As  the 
size  of  a  population  changes  over  time,  so  will  the  expected  number  of  cases  of 
diseases  and  injuries.   In  addition,  analysis  by  date  of  onset  rather  than  date  of 
report  more  clearly  defines  the  condition.  Because  of  delays  between  diagnosis  and 
reporting,  using  date  of  onset  when  practical  and  possible  provides  a  better 
representation  of  actual  disease  incidence.  The  longer  the  interval  between  the 
occurrence  of  symptoms,  the  seeking  of  health  care,  and  the  reporting  of  events,  the 
greater  the  need  for  a  surveillance  system  based  on  date  of  onset. 

Place 

Analysis by the place where the condition occurred is the next step (see Figure
1.2). The location from which the condition was reported (such as a hospital) may not
be the place where the exposure actually occurred (in the community). Similarly, for
medical  procedures,  the  place  an  operation  took  place  may  not  be  the  place  of 
residence  of  the  patient.   For  example,  the  District  of  Columbia  has  the  highest  rate 
of  legal  abortions  in  the  United  States,  but  more  than  50%  of  this  figure  reflects 
women  who  reside  outside  the  District  (9). 

Locating  the  geographic  area  with  the  highest  rates  can  facilitate  efforts  to  identify 
cause(s) and allow appropriate interventions to be applied. John Snow's removal of the
Broad Street pump handle remains the classic example of intervention by location (20).
Even  in  situations  in  which  the  numbers  of  a  particular  problem  are  decreasing,  focal 
areas  with  high  levels  of  the  condition  may  remain,  and  the  identification  of  these 
areas  allows  prevention  resources  to  be  targeted  effectively.  Finally,  the  size  of 
the  unit  for  geographic  analysis  is  determined  by  the  type  of  condition  involved.  For 
some  rare  conditions,  large  areas  such  as  states  may  be  appropriate,  whereas  for 
events  that  occur  at  relatively  high  frequency  or  for  outbreak  situations,  areas 
defined  by  postal  codes  or  other  geographic  boundaries  may  be  the  most  desirable  size 
of  the  measure. 

The availability of computers, as well as software for spatial mapping, allows more
sophisticated analysis of surveillance data by place. Public health officials are now
able to use surveillance data to follow the geographic course of a particular
condition, thus assisting in their efforts to plan intervention strategies (see "Maps"
below).

Person

Analyzing  surveillance  data  by  the  characteristics  of  persons  who  have  the  condition 
provides  further  specification.  The  demographic  variables  most  frequently  used  are 
age,  gender,  and  race/ethnicity.  Other  variables  such  as  marital  status,  occupation, 
and  levels  of  income  and  education  may  also  be  helpful,  even  though  most  surveillance 
systems  do  not  routinely  collect  such  information. 

Analysis  of  trends  in  disease/injury  by  age  depends  on  the  specific  health  condition 
of interest. For childhood diseases, relatively narrow age categories (e.g., by
single years) can identify the age group associated with the peak incidence of a
particular  health  condition.   Conversely,  for  conditions  that  primarily  affect  older 
populations,  broader  10-year  age  intervals  are  frequently  used.   In  general,  the 
typical  age  distribution  associated  with  the  health  condition  provides  the  best  guide 
to  deciding  which  age  categories  to  use,  with  several  narrower  categories  for  the  ages 
associated  with  peak  incidence  and  broader  categories  covering  the  remainder  of  the 
age  spectrum. 

Surveillance  systems  have  also  been  used  to  analyze  behavioral  characteristics  of 
populations.  Such  systems  generally  depend  on  self-reported  behavior  and  may  be  based 
on  repeated  surveys  of  representative  groups,  trends  in  markers  for  specific  types  of 
behavior  (e.g.,  sales  of  a  particular  product),  or  active  surveillance  of  a  particular 
behavioral characteristic or indicator in a defined group (e.g., testing urine for
drugs in school or work settings).

If  possible,  the  characteristics  of  persons  included  in  any  surveillance  system  should 
be  related  to  denominators.  While  assessing  the  number  of  cases  alone  can  be 
sufficient,  variable-specific  rates  are  more  helpful  in  allowing  comparisons  of  the 
risk  involved.  Thus,  even  if  the  number  of  cases  of  a  particular  condition  is  higher 
in  one  part  of  a  population,  the  rate  may  be  lower  if  that  group  represents  a  large 
proportion of the population. In this way, comparing the rates within surveillance
data  of  certain  populations  is  analogous  to  calculating  relative  risks  within 
observational  cohort  studies. 
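
A small numeric sketch may make the point concrete. The group names, case counts, and population sizes below are hypothetical; the aim is only to show that the group with more cases can have the lower rate, and that the ratio of two group-specific rates plays the role of a relative risk.

```python
def rate_per(cases, population, per=100_000):
    """Cases per `per` persons (multiplying before dividing keeps round numbers exact)."""
    return cases * per / population

# Hypothetical (cases, population) pairs for two groups in one surveillance area.
groups = {
    "group_a": (300, 600_000),  # more cases, but a much larger population
    "group_b": (200, 100_000),  # fewer cases in a smaller population
}

rates = {name: rate_per(cases, pop) for name, (cases, pop) in groups.items()}

# Ratio of the two group-specific rates, analogous to a relative risk.
rate_ratio = rates["group_b"] / rates["group_a"]
```

In this illustration group_a reports more cases, yet its rate is one-quarter that of group_b.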

Interactions  among  Time,  Place,  and  Person 

Proceeding from the simple (e.g., crude rates) to the more complex (e.g., variable-specific
rates) may reveal meaningful trends. This is because interactions
among  the  time-place-person  parameters  of  surveillance  data  can  obscure  important 
patterns  of  disease/injury  in  specific  populations.   For  example,  in  the  United  States 
in  the  1980s,  the  overall  number  of  syphilis  cases  fell  during  the  first  two-thirds  of 
the decade but rose beginning in 1987 (Figure V.1, Panel A). When analyzed by gender
(Figure V.1, Panel B), the decline in syphilis occurred primarily among men; cases
among women were low for the first 5 years, increased slightly in 1986, and rose more
rapidly for the rest of the decade. Finally, when stratified by both gender and race
(Figure V.1, Panel C), the decrease in numbers of cases of syphilis was seen only
among white males--presumably among men who have sex with other men and who had
changed their sexual practices in response to human immunodeficiency virus (HIV)
prevention activities (12). Conversely, the increase in syphilis occurred among black
men and women, with both trends beginning in 1986 and being linked to unsafe sexual
behavior associated with use of crack cocaine (13). If more specific analysis by
person  had  not  occurred,  the  offsetting  trends  in  the  mid-1980s  of  declines  among 
white  males  might  have  delayed  recognition  by  public  health  officials  of  the  syphilis 
epidemic  among  minorities. 

RATES AND RATE STANDARDIZATION

Overview 

A  rate  measures  the  frequency  of  an  event.   It  comprises  a  numerator  (i.e.,  the  upper 
portion  of  a  fraction  denoting  the  number  of  occurrences  of  an  event  during  a 
specified  time)  and  a  denominator  (i.e.,  the  lower  portion  of  a  fraction  denoting  the 
size of the population in which the events occur). A crucial aspect of a rate is the
specification  of  the  time  period  under  consideration.  An  optional  component  is  a 
multiplier,  a  power  of  10  that  is  used  to  convert  awkward  fractions  to  more  workable 
numbers (14). The general form of a rate is shown below:

\[
\text{rate} = \frac{\text{number of occurrences of event in specified time}}{\text{average or mid-interval population}} \times 10^{n},
\]

where the denominator represents the size of the population during the specified
period in which the events occur and the exponent n usually ranges from 2 to 6 (i.e.,
the rate is expressed per 100 to per 1,000,000 population). The selection of n depends on
the incidence or prevalence of the event.
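
As a minimal sketch of the general form above, the following Python function computes a rate from an event count and a mid-interval population. The function name, the averaging of start and end populations as the mid-interval estimate, and the numbers are all assumptions made for illustration.

```python
def crude_rate(events, population_start, population_end, n=5):
    """Events per 10**n persons, using the average of the start and end
    populations as a simple estimate of the mid-interval population."""
    mid_interval_population = (population_start + population_end) / 2
    return events * 10 ** n / mid_interval_population

# Hypothetical example: 45 events in one year in a population that grew
# from 148,000 to 152,000, expressed per 100,000 population (n = 5).
annual_rate = crude_rate(45, 148_000, 152_000, n=5)
```

A smaller exponent (e.g., n = 2 or 3) would suit a common event, whereas n = 5 or 6 keeps the rate of a rare event in a readable range.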

Although  surveillance  often  provides  numerator  data  only,  the  use  of  raw  numbers  such 
as  cases  of  a  disease  or  injury  has  limitations.  Raw  numbers  quantify  occurrences  of 
an  event  during  a  specified  time  without  regard  to  population  size  and  dynamics,  or 
other  demographic  characteristics  such  as  distribution  by  race  and  gender.   Rates 
enable  one  to  make  more  appropriate,  informative  comparisons  of  occurrences  in  a 
population  over  time,  among  different  sub-populations,  or  among  different  populations 
at  the  same  or  different  times,  since  the  size  of  the  population  and  the  period  of 
time  specified  are  accounted  for  in  the  calculation  of  rates. 

A  wide  variety  of  "rates"  are  employed  in  standard  public  health  practice  (Table  V.l). 
These  measures  are  calculated  in  numerous  ways  and  may  have  different  connotations. 
Special distinction should be made among the terms "rate," "ratio," and "proportion."
A ratio is any quotient obtained by dividing one quantity by another. The numerator
and denominator are generally distinct quantities, neither of which is a subset of the
other. No restrictions exist on the value or dimension of a ratio. A proportion is a
special type of ratio for which the numerator is a subset of the denominator
population, thus requiring the resulting quotient to be dimensionless, non-negative, and
no greater than one, or no greater than 100 if expressed as a percentage. Although all rates are
ratios, in epidemiology a rate may be a proportion (e.g., prevalence rate) or may be
limited in scope by further restrictions such as representing the number of
occurrences of a health event in a specified time and population per unit time (e.g.,
hazard or incidence rate). This latter definition is most restrictive and is the
definition  generally  used  for  rates  in  chemistry  and  physics. 

Use  of  Rates  in  Epidemiology 

Calculation  and  analysis  of  rates  is  critical  in  epidemiologic  investigations,  not 
only for formulating and testing hypotheses about cause(s), but also for identifying
risk factors for disease and injury. Rates also allow valid comparisons within or
among  populations  for  specific  times.  To  determine  rates,  one  must  have  reliable 
numerator  and  denominator  data,  the  latter  being  generally  more  difficult  to  obtain  in 
most epidemiologic investigations, particularly if the data to be analyzed (i.e., the
number  of  occurrences  of  an  event)  have  been  collected  from  public  health  surveillance 
systems. 

Crude,  Specific,  and  Standardized  Rates 

Crude  and  specific  rates 

Rates  can  be  calculated  either  for  the  entire  population  or  for  certain  subpopulations 
within  the  larger  group.   Rates  describing  a  complete  population  are  termed  "crude." 
The  computation  of  crude  rates  is  performed  as  the  initial  step  in  analysis  since  they 
are  important  in  obtaining  information  about  and  contrasting  entire  populations. 

Within  a  population,  the  rate  at  which  a  particular  health  event  occurs  may  not  be 
constant  throughout  the  entire  population.  To  examine  the  differences,  the  population 
is partitioned into relevant "specific" subpopulations, and a "specific" rate is
calculated for each subset. For example, if one calculates death rates by age group
(because death rate is not constant for all age categories), the resulting rates are
termed  "age-specific  death  rates." 

Variation  of  rates  among  population  subgroups  results  from  several  factors:  natural 
history  of  the  health  problem,  differential  distribution  of  susceptibility  or 
cause(s), or genetic differences among subpopulations. For example, mortality rates
are higher among men than among women and among blacks than among whites (15). The distribution of
subgroups within the population may also be so disparate that a summary rate may not
convey useful information. Therefore, the magnitude of a crude rate depends on the
magnitude of the rates of the subpopulations as well as on the demographics of the
entire population (16). These variations in rates across a population would remain
unknown  if  only  crude  rates  were  calculated. 


Standardized  rates 

When  rates  are  compared  across  different  populations  or  for  the  same  population  over 
time, crude rates are appropriate only if the populations are similar with respect to
factors that are associated with the health event being investigated (17). Such
factors could include age, race, gender, socioeconomic status, or risk factors (e.g.,
number of cigarettes smoked). If the populations are dissimilar, variable-specific
rates  should  be  computed  and  compared.   Alternatively,  the  rates  can  be  adjusted  for 
the  effect  of  a  confounding  variable  in  order  to  obtain  an  undistorted  view  of  the 
effect  that  other  variables  have  on  risk.   This  adjustment  of  rates  when  comparing 
populations  is  called  standardization  and  yields  "standardized"  or  "adjusted"  rates. 
The  two  techniques  of  standardization  are  direct  and  indirect. 

Direct standardization

A  directly  standardized  rate  is  obtained  for  a  study  population  by  averaging  the 
specific  rates  for  the  population,  using  the  distribution  of  a  selected  standard 
population  as  the  averaging  weights.  This  adjusted  rate  represents  "what  the  crude 
rate  would  have  been  in  the  study  population  if  that  population  had  the  same 
distribution as the standard population with respect to the variable(s) for which the
adjustment  or  standardization  was  carried  out"  (14).  The  rate  is  termed  "directly 
standardized"  because  specific  rates  are  used  directly  in  the  calculation.   If  data 
for  the  same  standard  population  are  used  to  calculate  directly  standardized  rates  for 
two  or  more  study  populations,  those  standardized  rates  can  be  appropriately 
compared.  Any  difference  among  the  standardized  rates  cannot  be  attributed  to 
differential  population  distributions  of  the  standardized  variable  because  the 
calculations have been adjusted for that variable (18). The following data must be
available  in  order  to  use  direct  adjustment: 

•  Specific  rates  for  the  study  population  and 

•  Distribution  for  the  selected  standard  population  across  the  same  strata 
as  those  used  in  determining  the  specific  rates. 
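
The calculation can be sketched as a weighted average of the study population's specific rates, with the standard population's distribution as weights. The function name, the three age strata, and every number below are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def directly_standardized_rate(specific_rates, standard_population):
    """Weighted average of the study population's stratum-specific rates,
    weighted by the standard population's size in each stratum."""
    total = sum(standard_population.values())
    weighted_sum = sum(
        specific_rates[stratum] * standard_population[stratum]
        for stratum in standard_population
    )
    return weighted_sum / total

# Hypothetical age-specific death rates (per 1,000) in a study population.
study_rates = {"0-39": 1.0, "40-64": 5.0, "65+": 50.0}

# Hypothetical standard population (thousands of persons) in the same strata.
standard_population = {"0-39": 600, "40-64": 300, "65+": 100}

adjusted_rate = directly_standardized_rate(study_rates, standard_population)
```

Applying the same standard_population to the specific rates of a second study population would yield a second adjusted rate that can be compared with this one.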

Indirect standardization

An  indirectly  standardized  rate  is  calculated  for  a  study  population  by  averaging  the 
specific  rates  for  a  select  standard  population,  using  the  distribution  of  the  study 
population  as  weights.  One  should  use  indirect  adjustment  when  any  of  the  specific 
rates  in  the  study  population  are  unavailable  or  when  such  small  numbers  exist  in  the 
categories  of  strata  that  the  data  are  unreliable  (i.e.,  the  resulting  rates  are 
unstable). This commonly occurs in occupational mortality or in small geographic
areas.   For  these  reasons,  indirect  standardization  is  used  more  often  than  direct 
standardization.   Indirectly  standardized  rates  for  two  or  more  populations  of 
interest  can  be  appropriately  compared  if  the  same  standard  population  is  used  in  the 
computations.   The  following  data  are  required  to  make  an  indirect  adjustment  to  a 
rate: 

•  Specific  rates  for  the  selected  standard  population, 

•  Distribution  for  the  study  population  across  the  same  strata  as  those 
used  in  calculating  the  specific  rates, 

•  Crude  rate  for  the  study  population,  and 

•  Crude  rate  for  the  standard  population. 

A  special  application  of  the  indirect  standardized  rate,  when  the  health  event  of 
interest is death, is the standardized mortality ratio (SMR). It is the number of
deaths occurring in a study population or subpopulation, expressed as a percentage of
the number of deaths expected to occur if the given population and the selected
standard population had the same specific rates (19). Explicitly, the SMR is an
indirect, age-adjusted ratio calculated as the indirect standardized mortality rate
for the study population, divided by the crude mortality rate for the standard
population. Additional information is available on the use of the SMR, as well as on
computation of variance and confidence intervals for direct and indirect
measures (18).
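
The relationship among expected deaths, the SMR, and the indirectly standardized rate can be sketched as follows. The function name, strata, and all numbers are hypothetical; the observed number of deaths stands in for the study population's crude rate, since the two differ only by the population total.

```python
def indirect_standardization(observed_deaths, study_population,
                             standard_rates, standard_crude_rate):
    """Return expected deaths, the SMR, and the indirectly standardized rate."""
    # Deaths expected if the study population had the standard's specific rates.
    expected_deaths = sum(
        standard_rates[stratum] * study_population[stratum]
        for stratum in study_population
    )
    smr = observed_deaths / expected_deaths
    # Indirectly standardized rate = SMR x crude rate of the standard population.
    adjusted_rate = smr * standard_crude_rate
    return expected_deaths, smr, adjusted_rate

# Hypothetical study population (thousands of persons) by age stratum.
study_population = {"0-39": 2, "40-64": 3, "65+": 5}
# Hypothetical standard-population death rates (per 1,000) in the same strata.
standard_rates = {"0-39": 1, "40-64": 5, "65+": 50}

expected, smr, adjusted = indirect_standardization(
    observed_deaths=240,
    study_population=study_population,
    standard_rates=standard_rates,
    standard_crude_rate=8.8,  # hypothetical crude rate per 1,000
)
smr_percent = 100 * smr
```

Here fewer deaths were observed than expected, so the SMR falls below 100% and the adjusted rate falls below the standard population's crude rate.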

Choice  of  Standard  Population 

If crude rates are to be adjusted, an appropriate standard population needs to be
chosen. In extreme cases, the choice of different standard populations can lead
to different results. For example, use of one standard population may yield an
adjusted rate higher for population A than for population B, while choice of another
standard population may yield a higher rate for population B (18).

Two  factors  should  be  considered  when  choosing  a  standard  population: 

•  Select  a  population  that  is  representative  of  the  study  populations  being 
compared  and 

•  Understand how choice of a standard population affects directly
standardized rates (e.g., if the age-specific rates for population A are
greater than for population B at young ages and the opposite is true at
older ages, a standard population with distribution skewed to younger
ages will yield a higher directly standardized rate for population A than
for population B).
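
The reversal described in the second point can be reproduced with a small hypothetical example. The two age strata, the rates for populations A and B, and both standard populations below are invented solely to show the effect.

```python
def directly_standardized_rate(specific_rates, standard_population):
    """Specific rates averaged with the standard population as weights."""
    total = sum(standard_population.values())
    return sum(specific_rates[s] * standard_population[s]
               for s in standard_population) / total

# Hypothetical age-specific rates (per 1,000): A is higher at young ages,
# B is higher at older ages.
rates_a = {"young": 6, "old": 20}
rates_b = {"young": 2, "old": 30}

# Two hypothetical standard populations with different age distributions.
young_standard = {"young": 900, "old": 100}
old_standard = {"young": 300, "old": 700}

a_young = directly_standardized_rate(rates_a, young_standard)
b_young = directly_standardized_rate(rates_b, young_standard)
a_old = directly_standardized_rate(rates_a, old_standard)
b_old = directly_standardized_rate(rates_b, old_standard)
```

With the younger standard, population A has the higher adjusted rate; with the older standard, the ranking reverses. Crossing specific rates of this kind are a signal to examine the specific rates themselves.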

Generally  the  choice  of  standard  population  makes  little  difference  in  comparing 
adjusted  rates.   Although  magnitudes  of  the  adjusted  rates  depend  upon  choice  of 
standard  population,  no  meaning  is  attached  to  those  magnitudes;  only  relative 
differences  in  the  adjusted  rates  can  be  assessed. 

Various  choices  are  available  for  a  standard  population.  Customary  selections  include 
the  combined  or  pooled  population  of  the  overall  population  to  be  studied,  the 
population  of  one  of  the  study  groups,  a  large  population  (such  as  the  1940  or  1980 
United States population), or a hypothetical population. Calculating standardized
rates using different standard populations allows comparisons of different
distributions (20).

To Standardize or Not To Standardize. The decision to standardize is not always
straightforward.   Several  factors,  most  of  which  are  data-driven,  must  be  considered 
in  the  decision  process.  Reasons  to  present  standardized  rates  include  the  following 
(17): 

•  Standardization  adjusts  for  confounding  variables  to  yield  a  more 
realistic  view  of  the  effect  of  other  variables  on  risk, 

•  A  summary  measure  for  a  population  is  easier  to  compare  with  similar 
summary  measures  than  are  sets  of  specific  rates, 

•  A  standardized  rate  has  a  smaller  standard  error  than  any  of  the  specific 
rates  (this  is  important  when  comparing  sub-populations  or  geographic 
areas),

•  Specific  rates  may  be  imprecise  or  unstable  because  of  sparse  data  in  the 
strata,  and 

•  Specific  rates  may  be  unavailable  for  certain  groups  of  interest  (e.g., 
small populations or those designated by specific geographic areas).

The major disadvantage of standardization is evident when the specific rates vary
differently across strata, such as when they move in different directions or by
different magnitudes in individual age groups. In this case the trend in the
standardized  rate  is  a  weighted  average  of  the  trends  in  the  specific  rates,  where  the 
weights  depend  on  the  standard  population  selected.  When  this  occurs,  the 
standardized  rate  tends  to  mask  the  differences,  and  no  single  summary  measure  will 
reveal  these  differences. 

Another  unfavorable  characteristic  of  standardized  rates  is  that  their  magnitude  is 
arbitrary  and  depends  entirely  on  the  standard  population.  Although  generally  not  the 
case,  relative  rankings  of  summary  measures  from  different  study  populations  may 
change  if  a  different  standard  population  is  selected. 

Regardless  of  the  decision  made  regarding  standardization,  it  is  crucial  to  evaluate 
the  specific  rates  to  characterize  accurately  and  to  understand  more  fully  the 
variation  among  study  populations.  Standardized  rates  should  never  be  used  as  a 
substitute  for  specific  rates,  nor  should  they  be  the  basis  of  inferences  when 
specific  rates  can  be  computed.  A  compromise  to  the  use  of  a  summary  measure  versus  a 
set  of  specific  measures  is  to  use  the  specific  rates  but  to  eliminate  or  combine 
categories  to  minimize  the  number  of  rates  required  for  comparison.  Additional 
discussion  is  available  on  advantages  and  disadvantages  of  standardization  and  on 
analyzing  crude  and  specific  rates  (21). 

Rate standardization: practical example

To  demonstrate  how  crude,  specific,  and  standardized  rates  are  obtained,  we  compare 
death  rates  in  two  Florida  counties.  This  example  shows  how  standardized  rates  can  be 
misleading  if  they  are  not  properly  scrutinized. 

We  will  use  population  and  death  totals  for  Pinellas  and  Dade  Counties  in  Florida  for 
1980  (Table  V.2).  The  crude  death  rate  for  Pinellas  County  is  about  60%  higher  than 
that  for  Dade  County.   When  the  age  distributions  of  each  county  are  used,  the 
resulting  age-specific  death  rates  are  generally  slightly  higher  in  Dade  County  (Table 
V.3),  even  though  the  crude  death  rate  is  substantially  higher  for  Pinellas  County. 
This  seeming  anomaly  in  the  data  results  from  the  different  age  distributions  of  each 
county.   Specifically,  the  population  in  Pinellas  is  older. 

Directly standardizing the Pinellas and Dade County rates to the United States 1980
population  corrects  for  the  differences  in  population  (Table  V.4).  Once  differences 
in  age-related  distributions  in  the  two  counties  have  been  taken  into  account,  the 
adjusted  death  rate  for  Pinellas  County  is  lower  than  that  for  Dade  County  (7.7  and 
7.9,  respectively). 

The  indirect  method  of  adjustment  increases  the  relative  difference  between  death 
rates  for  the  two  counties  (Table  V.5).  The  adjusting  factor  is  computed  as  the  1980 
death rate for the total U.S. population divided by the expected death rate. The
adjusted death rate is then calculated as the adjusting factor multiplied by the crude
death  rate.   In  this  example,  indirect  adjustment  reinforces  and  accentuates  the 
results  of  direct  adjustment  by  yielding  rates  of  7.5  and  7.8  deaths  per  1,000 
population  for  Pinellas  and  Dade  Counties,  respectively. 

This  example  illustrates  the  importance  of  being  thoroughly  familiar  with  the  data. 
Comparison  of  crude  death  rates  alone  can  be  misleading.   However,  calculating  age- 
specific  and  adjusted  rates  permits  an  accurate  understanding  of  death  rates  in  these 
counties  and  shows  that  the  high  crude  rate  in  Pinellas  County  reflects  its  older 
population.   The  example  also  illustrates  how  the  magnitude  of  adjusted  rates  depends 
on  the  choice  of  standard  population. 

Analysis  of  Rates 

When  numerator  and  denominator  data  are  available,  analysis  of  rates  should  always 
begin  with  calculation  of  crude  rates  and  proceed  to  subsequent  computation  of 
relevant  specific  rates.   If  appropriate,  a  standard  population  can  be  chosen  to 
determine  standardized  rates.  Tables  and  especially  maps  are  important  means  of 
presenting rates at different times and/or locations (see "Tables," "Graphs," and
"Maps" below).

Several  statistical  procedures  are  available  to  analyze  data.   Inference  on  a  single 
proportion  is  performed  using  a  z  test,  and  assessing  the  difference  between  two 
proportions can be accomplished with a z or chi-square (χ²) test (17).* Use of Poisson parameters
is helpful in comparing two rates (22). A series of χ² tests can be used to compare
proportions from several independent samples (16), and Poisson regression is
frequently used for comparing several rates (23). Other modeling procedures that can
be used to analyze rates include smoothing, Box-Jenkins, and Kalman filter approaches,
all of which are time-series methods discussed in Chapter VI. Space-time cluster
techniques and small-area estimation methods are also discussed in Chapter VI.

*Note that Fleiss does not distinguish between rates and proportions or the analysis
of them.
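
As one illustration of the procedures mentioned above, the following sketch carries out a z test for the difference between two proportions. It assumes the usual pooled-variance, large-sample form of the test and a two-sided alternative; the function name and the counts are hypothetical, and the cited references should be consulted for the exact procedures they describe.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Large-sample z test of H0: p1 == p2, using the pooled proportion.
    Returns the z statistic and the two-sided p-value."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical data: 30 of 200 persons affected in one group, 15 of 200 in another.
z, p_value = two_proportion_z_test(30, 200, 15, 200)
```

The normal approximation is reasonable only when the expected counts in each group are not small; sparse data call for exact methods.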

EXPLORATORY DATA ANALYSIS

Overview

Exploratory data analysis (EDA) is enumerative, numeric, or graphic detective
work (24). It is the application of a set of techniques to a body of data to make the
data more understandable. EDA is a philosophy that minimizes assumptions, allows the
data to motivate the analysis, and combines ease of description with quantitative
knowledge. EDA leads the analyst to uncover characteristics often hidden within the
data.

Practice of EDA involves four fundamental steps (24-25):

1.  Using  visual  displays  to  convey  the  structure  of  the  data  and  analyses, 

2.  Transforming  the  data  mathematically  to  simplify  their  distribution  and 
to  clarify  their  analysis, 

3.  Investigating  the  influence  that  unusual  observations  (outliers)  have  on 
the  results  of  analysis,  and 

4.  Examining  the  residuals  (the  difference  between  the  observed  data  and  a 
fitted  model)  to  provide  additional  insight  into  the  data. 

EDA  is  the  initial  step  in  any  analysis.  It  allows  the  investigator  to  become 
familiar  with  the  data  and  forms  the  foundation  for  further  analysis.  Although  most 
public  health  surveillance  systems  are  established  for  specific  topics,  proper  EDA  of 
the  data  can  provide  insight  into  demographic,  temporal,  and  spatial  patterns 
otherwise  overlooked  in  the  collection  of  numbers.   EDA  may  additionally  contribute  to 
more  timely  detection  of  unusual  observations,  which  may,  in  turn,  facilitate  a 
quicker  public  health  response  to  factors  that  cause  increased  morbidity  and/or 
mortality.

Data Displays

A first step in any analysis of data is a visual examination of the data. A few of
the techniques that should be used initially are described below for application to a
single set of numbers, for exploration of relationships between two factors, and for
comparisons among several populations.

Dot  plots 

A  dot  plot  is  a  one-dimensional  plot  (Figure  V.2)  of  the  individual  values  of  a  set  of 
numbers.   The  x-axis  represents  one  or  more  categories  of  a  non-continuous  variable, 
and  the  y-axis  represents  the  range  of  values  displayed  by  the  observations. 
Observations  with  identical  values  are  plotted  side  by  side  on  the  same  horizontal 
plane. 

Stem-and-Leaf  Displays 

A  stem-and-leaf  display  is  a  graphic  (Figure  V.3)  that  allows  the  digits  of  the 
observation  values  to  sort  the  numbers  into  numerical  order  for  display.  This  is  a 
variation  of  the  conventional  histogram.  The  basic  principle  used  in  constructing  a 
stem-and-leaf  display  is  the  splitting  of  each  data  value  between  a  suitable  pair  of 
adjacent  digits  to  form  a  set  of  leading  digits  and  a  set  of  trailing  digits.  The  set 
of  leading  digits  forms  the  stems,  and  the  set  of  the  first  trailing  digit  from  the 
data  forms  the  leaves.   Remaining  trailing  digits  are  ignored  for  the  purpose  of  the 
graphic. Variations to the stem-and-leaf display are possible (24).
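
The construction just described can be sketched in Python for non-negative whole numbers. The function name, the leaf_unit parameter (the place value of the leaf digit), and the sample values are assumptions made for this illustration; digits to the right of the leaf are dropped, as in the text.

```python
from collections import defaultdict

def stem_and_leaf(values, leaf_unit=1):
    """Return the lines of a stem-and-leaf display for non-negative numbers.
    The leaf is the digit in the `leaf_unit` place; lower digits are ignored."""
    stems = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(int(value // leaf_unit), 10)
        stems[stem].append(leaf)
    lines = []
    # Include empty stems so that gaps in the data remain visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem:>3} | {leaves}")
    return lines

# Hypothetical weekly case counts.
display = stem_and_leaf([12, 15, 21, 23, 23, 30, 47])
print("\n".join(display))
```

Because the values are sorted before they are split, the leaves on each stem appear in ascending order, which makes medians and other order-based summaries easy to read from the display.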

Many investigators begin an evaluation of data with a histogram (see below), but the
stem-and-leaf display has several advantages over the histogram. Because every
observation is plotted in the stem-and-leaf display, it contains more detail than the
histogram and allows computation of percentage points. Moreover, transformations can
be  applied  directly  to  stem-and-leaf  data. 

Scatter  plots 

The  scatter  plot  or  scatter  diagram  is  a  plot  (Figure  V.4)  that  reveals  the 
relationship  between  two  variables.  Each  observation  comprises  a  pair  of  values,  one 
for  each  variable.  The  observation  is  plotted  by  measuring  the  value  of  one  variable 
on  the  horizontal  axis  and  the  value  of  the  other  on  the  vertical  axis. 

Data summaries

One  can  summarize  a  data  set  by  calculating  a  few  numbers  which  are  relatively  easy  to 
interpret.  For  example,  measures  of  central  tendency  and  variability  are  frequently 
used  to  describe  data.   In  particular,  two  types  of  summary  displays  have  proven 
useful  in  characterizing  data,  i.e.,  the  five-number  summary  and  the  box  plot. 

Five-number  summaries 

The  five-number  summary  of  a  data  set  is  a  simple  display  (Table  V.6)  involving  the 
median, hinges, and extreme values. The median is a measure of the central tendency of
the  data  that  splits  an  ordered  data  set  in  half.  The  hinges  are  a  measure  of  the 
variability  of  the  data  and  are  the  values  in  the  middle  of  each  half.   Therefore,  the 
hinges  are  the  data  values  that  are  approximately  1/4  and  3/4  from  the  beginning  of 
the ordered data set. They are determined by formulas (25) and are similar to
quartiles  that  are  defined  so  that  1/4  of  the  observations  lie  below  the  lower 
quartile  and  1/4  lie  above  the  upper  quartile.   The  extremes  also  reflect  the 
variability  of  the  data  and  are  the  smallest  and  largest  values  in  the  data. 
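
A minimal sketch of the five-number summary follows. It assumes one common convention for the hinges, namely the median of each half of the ordered data with the overall median included in both halves when the number of observations is odd; the formulas in the cited reference may differ in detail. The function name and data are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (lower extreme, lower hinge, median, upper hinge, upper extreme)."""
    data = sorted(values)
    n = len(data)
    half = (n + 1) // 2  # size of each half; includes the median when n is odd
    lower_half = data[:half]
    upper_half = data[n - half:]
    return (data[0], median(lower_half), median(data),
            median(upper_half), data[-1])

# Hypothetical observations, deliberately unordered.
summary_odd = five_number_summary([7, 1, 3, 9, 5, 2, 8, 4, 6])
summary_even = five_number_summary([1, 2, 3, 4, 5, 6, 7, 8])
```

The distance between the two hinges gives a measure of spread that, unlike the range between the extremes, is little affected by a few unusual observations.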

Box  plots 

The  box  plot  is  a  graphic  representation  (Figure  V.5)  of  the  five-number  summary  with 
the  two  ends  of  the  box  representing  the  hinges  and  the  line  through  the  box 
representing  the  median.  A  line  runs  from  each  end  of  the  box  (i.e.,  from  each  hinge) 
to  the  corresponding  lower  and  upper  extreme  values.  This  plot  allows  the  reader  to 
see  quickly  the  median  level,  the  variability,  and  the  symmetry  of  the  data. 
Variations of the box plot, including identification of outlier values, are possible
(25).

Transformations 

Transformation  or  re-expression  of  data  is  a  powerful  tool  that  facilitates 
understanding  their  implications.   If  numbers  are  collected  in  a  manner  that  renders 
them  hard  to  grasp,  the  data  analyst  should  use  a  transformation  method,  while 
preserving  as  much  of  the  original  information  as  can  be  used.  When  used 
appropriately,  transformed  data  can  be  readily  analyzed  and  interpreted. 

Raw data are transformed for a number of reasons: to achieve symmetry, to produce
a straight-line relationship, to allow use of an additive model,
to reduce variability, and to attain normally distributed data. Symmetry is highly
desirable  when  analyzing  a  single  data  set,  since  it  ensures  that  a  "typical"  value 
(such  as  the  mean  or  median)  more  nearly  summarizes  the  data.  When  analyzing  pairs  of 
data,  a  straight-line  relationship  is  important  because  linear  associations  are 
simple,  both  in  form  and  in  interpretation.   One  or  both  variables  can  be  transformed 
to achieve linearity. Additive models have the desirable feature that data in
multi-way tables can typically be decomposed into additive effects and analyzed accordingly.
Reduced  variability  of  the  data  is  crucial  when  comparing  several  data  sets.   If  the 
data  spread  varies  with  the  data  set,  then  "typical"  values  are  obtained  more 
accurately  in  the  data  with  smaller  spread.  Finally,  normally  distributed  data  are 
needed  so  that  normal  theory  statistics  can  be  applied  to  test  hypotheses  and  draw 
inferences. 

Not all data sets benefit from transformation. The ratio of the largest to smallest value in
the  original  data  set  is  a  simple  indicator  of  whether  a  group  of  numbers  will  be 
affected  substantially  by  transforming.   If  the  ratio  is  near  1,  a  transformation  will 
not  severely  alter  the  appearance  of  the  data.  Since  transformations  affect  larger 
values  and  smaller  values  differently,  the  further  the  ratio  is  from  1,  the  greater 
the  need  is  for  transformation  to  display  and  understand  the  data  most  simply. 

Transformations  are  generally  accomplished  by  raising  each  value  of  the  data  set  to 
some  power  p.   Different  values  of  p  yield  different  effects  on  a  data  set,  but  those 
effects  are  ordered  if  the  values  of  p  are  ordered.   Some  transformations  are 
especially effective in certain instances (Table V.7). For example, the square root
transformation is particularly capable of reducing variability in count data.
Guidelines are available to assist in selecting appropriate transformations (24,25).
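
As a rough illustration of the two ideas above (the largest-to-smallest ratio as a screening indicator, and re-expression by a power p), the following sketch may help. The function names and the counts are hypothetical, and treating p = 0 as the logarithm is a common convention assumed here rather than something stated in the text.

```python
import math

def spread_ratio(values):
    """Ratio of the largest to the smallest value (positive values only)."""
    if min(values) <= 0:
        raise ValueError("the ratio is only meaningful for positive values")
    return max(values) / min(values)

def power_transform(values, p):
    """Raise each value to the power p (assumed convention: p = 0 means log10)."""
    if p == 0:
        return [math.log10(v) for v in values]
    return [v ** p for v in values]

# Hypothetical case counts spanning two orders of magnitude.
counts = [4, 9, 16, 100, 400]
print(spread_ratio(counts))                 # far from 1, so a transformation matters

roots = power_transform(counts, 0.5)        # square root, often used for counts
print(roots)
print(spread_ratio(roots))                  # the spread is much reduced
```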

Smoothing 

Smoothing  refers  to  EDA  techniques  that  summarize  consecutive,  overlapping  segments  of 
a  series  of  data  to  produce  a  smoother  curve.   Its  goal  is  to  represent  patterns  in 
the  data  more  clearly  without  becoming  encumbered  with  any  detailed  peaks  and  valleys. 
Variations  in  the  data  set  caused  by  irregular  components  are  smoothed  so  that  the 
overall  trend  can  be  determined  more  readily.  Thus,  smoothing  allows  investigators  to 
search for patterns in data that may otherwise be masked.

Smoothing  is  used  on  data  series  to  explore  the  relationship  between  two  variables. 
The values along the x-axis should be equally spaced. The y values are called a time
series if they are collected over successive time intervals, although these values
need not be defined by time (e.g., in a data sequence of birth rates by mother's age).
As long as the x-axis defines an order and the order is not too irregular, the y
sequence can be called a time series, and smoothing techniques can be applied. In
time-series analysis, models are frequently developed on the smoothed data because
these data are generally easier to model.

Numerous smoothing approaches exist, each having its own assets and liabilities. The
simplest example of smoothing is a moving average of three intervals in which
observation $y_i$ in the data sequence is replaced with the mean of $y_{i-1}$, $y_i$, and $y_{i+1}$.
Discussions of smoothing functions, including suggestions on how to overcome the
problem of obtaining end points for the smoothed series, appear elsewhere (25-26).
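
The three-interval moving average can be sketched as follows. This is a minimal illustration, not a recommended smoother: the end points are simply left unchanged, which is an assumption made here for brevity (the references cited above discuss better end-point rules). The function name and data are hypothetical.

```python
def moving_average_3(y):
    """Replace each interior y[i] with the mean of y[i-1], y[i], and y[i+1].

    The first and last values are copied unchanged (an assumed end-point rule).
    """
    smoothed = [float(v) for v in y]
    for i in range(1, len(y) - 1):
        smoothed[i] = (y[i - 1] + y[i] + y[i + 1]) / 3
    return smoothed

# Hypothetical weekly counts with an irregular saw-tooth pattern.
cases = [2, 8, 2, 11, 5, 14, 8]
print(moving_average_3(cases))  # [2.0, 4.0, 7.0, 6.0, 10.0, 9.0, 8.0]
```

The smoothed interior values rise more steadily than the raw counts, which makes the upward trend easier to see.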

DATA  GRAPHICS 

Overview 

Visual  tools  play  a  critical  role  in  public  health  surveillance.  Data  graphics 
visually  display  measured  quantities  using  points,  lines,  a  coordinate  system, 
numbers,  symbols,  words,  shading,  and  color  (27).  Graphics  allow  researchers  to  mesh 
presentation  and  analysis.  Data  graphics  are  essential  to  organizing,  summarizing, 
and  displaying  information  clearly  and  effectively.  The  design  and  quality  of  such 
graphics  largely  determine  how  effectively  scientists  can  present  their  information. 

Many  visual  tools  are  available  to  assist  in  analysis  and  presentation  of  results. 
The  data  to  be  presented  and  the  purpose  for  the  presentation  are  the  key  factors  in 
deciding which visual tools should be used (Table V.8). Further discussion and
guidance in producing effective, high-quality data graphics are available from several
sources (27-32).

Tables 

A table arranges data in rows and columns and is used to demonstrate data patterns and
relationships among variables and to serve as a source of information for other types
of data graphics (28). Table entries can be counts, means, rates, or other analytic
measures.

A  table  should  be  simple;  two  or  three  small  tables  are  simpler  to  understand  than  one 
large  one.  A  table  should  be  self-explanatory  so  that  if  taken  out  of  context  readers 
can still understand the data. The guidelines below should be used to increase
effectiveness of a table and ensure that it is self-explanatory (29).

•  Describe what, when, and where in a clear, concise table title.

•  Label each row and column clearly and concisely.

•  Provide units of measure for the data.

•  Provide row and column totals.

•  Define abbreviations and symbols.

•  Note data exclusions.

•  If the data are not original, reference the source.

One-variable tables

One of the most basic tables is a frequency distribution by category for a single
variable. For example, the first column of the table contains the categories of the
factor of interest, and the second column lists the number of persons or events that
appear in each category and gives the total count. Often a third column contains
percentages of total events in each category (Table V.9).
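
A one-variable table of this kind can be produced directly from a list of observations. The sketch below is illustrative; the function name, the age-group labels, and the data are hypothetical and are not taken from Table V.9.

```python
from collections import Counter

def frequency_table(observations):
    """Return rows of (category, count, percent of total), ending with a total row."""
    counts = Counter(observations)
    total = sum(counts.values())
    rows = [(cat, n, 100.0 * n / total) for cat, n in sorted(counts.items())]
    rows.append(("Total", total, 100.0))
    return rows

# Hypothetical age group recorded for each reported case.
age_groups = ["<5", "5-14", "<5", "15+", "<5", "5-14", "15+", "15+", "15+", "<5"]

print(f"{'Age group':<10}{'Cases':>6}{'Percent':>9}")
for category, count, percent in frequency_table(age_groups):
    print(f"{category:<10}{count:>6}{percent:>9.1f}")
```

In practice the categories would be listed in a meaningful order (here, by age) rather than in the alphabetical order that `sorted` produces.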


Multi-variable  tables 

Most  phenomena  monitored  by  public  health  surveillance  systems  are  complex  and  require 
analysis  of  the  interrelationships  of  several  factors.  When  data  are  available  on 
more  than  one  variable,  multi-variable  cross-classified  tables  can  elucidate 
associations.   These  tables  are  also  called  contingency  tables  when  all  the  primary 
table  entries  (e.g.,  frequencies,  persons,  or  events)  are  classified  by  each  of  the 
variables  in  the  table  (Table  V.10). 

The most frequently used type of table in epidemiologic analysis is the two-by-two
contingency table, which is appropriate when two variables, each having two
categories, are studied. This special case is particularly suited for analyzing
case-control and cohort studies for which the categories of the variables are case and
control (or ill and well) and exposed and unexposed.
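
To make the layout concrete, the sketch below cross-classifies a line list of (exposed, ill) pairs into the four cells of a two-by-two table and prints it with row and column totals. The function name and the counts are hypothetical.

```python
def two_by_two(records):
    """Count (exposed, ill) pairs into the four cells of a 2x2 table."""
    cells = {(exposed, ill): 0 for exposed in (True, False) for ill in (True, False)}
    for exposed, ill in records:
        cells[(bool(exposed), bool(ill))] += 1
    return cells

# Hypothetical line list: one (exposed?, ill?) pair per person.
records = ([(True, True)] * 30 + [(True, False)] * 20 +
           [(False, True)] * 10 + [(False, False)] * 40)
cells = two_by_two(records)

print(f"{'':<12}{'Ill':>6}{'Well':>6}{'Total':>7}")
for exposed, label in ((True, "Exposed"), (False, "Unexposed")):
    ill, well = cells[(exposed, True)], cells[(exposed, False)]
    print(f"{label:<12}{ill:>6}{well:>6}{ill + well:>7}")
total_ill = cells[(True, True)] + cells[(False, True)]
total_well = cells[(True, False)] + cells[(False, False)]
print(f"{'Total':<12}{total_ill:>6}{total_well:>6}{total_ill + total_well:>7}")
```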

Graphs 

A  graph  is  a  visual  display  of  quantitative  information  involving  a  system  of 
coordinates.  Two-dimensional  graphs  are  generally  depicted  along  an  x-axis 
(horizontal  orientation)  and  y-axis  (vertical  orientation)  coordinate  system.   Graphs 
are  primary  analytic  tools  used  to  assist  the  reader  to  visualize  patterns,  trends, 
aberrations,  similarities,  and  differences  in  data. 


Simplicity  is  key  to  designing  graphs.   Simple,  uncluttered  graphs  are  more  likely 
than  complicated  presentations  to  convey  information  effectively.   Several  specific 
principles should be observed when constructing graphs (29).

•  Ensure that a graph is self-explanatory by clear, concise labeling of
title, source, axes, scales, and legends.

•  Clearly differentiate variables by legends or keys.

•  Minimize the number of coordinate lines.

•  Portray frequency on the vertical scale, starting at zero, and the method
of classification on the horizontal scale.

•  Assure that scales for each axis are appropriate for the data.

•  Clearly indicate scale division, any scale breaks, and units of measure.

•  Define abbreviations and symbols.

•  Note data exclusions.

•  If the data are not original, reference the source.


Several  commonly  used  graphs  are  described  below.  The  scatter  plot,  an  extremely 
helpful  graph  for  detecting  the  relationship  between  two  variables,  has  already  been 
described  (see  "Data  Displays"). 


Arithmetic-scale  line  graphs 

An arithmetic-scale line graph is one in which equal distances along the x and/or y
axes represent equal quantities along that axis. This type of graph is typically used
to demonstrate an overall trend over time rather than focusing on particular
observation values. It is most helpful for examining long series of data or for
comparing several data sets (see Figure 1.1).

The  scale  of  the  x-axis  is  usually  presented  in  the  same  increments  as  the  data  are 
collected  (e.g.,  weekly  or  monthly).   Several  factors  should  be  considered  when 
selecting a scale for the y-axis (28).

•  Choose a length for the y-axis that is suitably proportional to that of
the x-axis. (A common recommendation is a 5:3 x:y-axis ratio.)

•  Identify  the  maximum  y-axis  value  and  round  the  value  up  slightly. 

•  Select  an  interval  size  that  provides  enough  detail  for  the  purpose  of 
the  graph. 

Scale  breaks  can  be  used  for  either  or  both  axes  if  the  range  of  the  data  is 
excessive.  However,  care  should  be  taken  to  avoid  misrepresentation  and 
misinterpretation  of  the  data  when  scale  breaks  are  used. 

Semi-logarithmic-scale line graphs

A semi-logarithmic-scale line graph, or semi-log graph, is characterized by one axis being
measured on an arithmetic scale (usually the x-axis) and the other being measured on a
logarithmic scale. A logarithm is the exponent expressing the power to which a base
number is raised (e.g., $\log_{10} 100 = \log_{10} 10^2 = 2$ for base 10). The axis portraying the
logarithmic  scale  on  semi-log  graph  paper  is  divided  into  several  cycles,  with  each 
cycle  representing  an  order  of  magnitude  and  values  10  times  greater  than  the 
preceding  cycle  (e.g.,  a  3-cycle  semi-log  graph  could  represent  1  to  10  in  the  first 
cycle,  10  to  100  in  the  second  cycle,  and  100  to  1,000  in  the  third  cycle). 

A semi-logarithmic-scale line graph is particularly valuable when examining the rate
of change in surveillance data, because a straight line represents a constant rate of
change. For absolute changes, an arithmetic-scale line graph would be more
appropriate. The semi-log scale is also useful when large differences in magnitude or
outliers occur because this type of graph allows the plotting of wide ranges of values
(see Figure 1.6). With semi-log graphs, the slope of the line indicates the rate of
increase or decrease; thus a horizontal line indicates no change. Also,
parallel lines for two conditions demonstrate identical rates of change (29).
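
The statement that a constant rate of change plots as a straight line on a semi-log graph can be checked numerically: if each period's count is a fixed multiple of the previous one, successive differences in log10 are equal. The sketch below uses hypothetical series and a hypothetical function name.

```python
import math

def log10_steps(series):
    """Differences between successive log10 values (the slope on a semi-log graph)."""
    logs = [math.log10(v) for v in series]
    return [later - earlier for earlier, later in zip(logs, logs[1:])]

# Hypothetical series: one grows by 50% per period, the other by 50 cases per period.
constant_rate = [100 * 1.5 ** k for k in range(6)]
constant_increment = [100 + 50 * k for k in range(6)]

print([round(s, 4) for s in log10_steps(constant_rate)])       # equal steps: a straight line
print([round(s, 4) for s in log10_steps(constant_increment)])  # shrinking steps: a flattening curve
```

The second series would be a straight line on an arithmetic scale, which is why that scale is preferred for absolute changes.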

Histograms 

A  histogram  is  a  graph  in  which  a  frequency  distribution  is  represented  by  adjoining 
vertical  bars.  The  area  represented  by  each  bar  is  proportional  to  the  frequency  for 
that interval (i.e., the height multiplied by the width of each bar yields the number
of events for that interval). Thus, scale breaks should never be used in histograms
because they misrepresent the data.

Histograms can be constructed with equal or unequal class intervals. With equal class
intervals, the height of each bar is proportional to the frequency of the
events in that interval. We do not recommend using histograms with unequal class
intervals because they are difficult to construct and interpret correctly.
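
The area principle explains why unequal class intervals are awkward: the height of each bar must be the frequency divided by the interval width, not the frequency itself. The following sketch, with a hypothetical function name and hypothetical data, shows the calculation.

```python
def bar_heights(class_edges, frequencies):
    """Height of each bar so that height multiplied by width equals the frequency."""
    if len(class_edges) != len(frequencies) + 1:
        raise ValueError("need one more edge than there are classes")
    heights = []
    for left, right, freq in zip(class_edges, class_edges[1:], frequencies):
        width = right - left
        if width <= 0:
            raise ValueError("class edges must be strictly increasing")
        heights.append(freq / width)
    return heights

# Hypothetical age classes of unequal width (0-5, 5-10, 10-20, 20-40 years).
edges = [0, 5, 10, 20, 40]
cases = [10, 20, 30, 20]
print(bar_heights(edges, cases))  # [2.0, 4.0, 3.0, 1.0]
```

Although the third class has the most cases, its bar is not the tallest, because those cases are spread over a wider interval. With equal class intervals the heights are simply proportional to the frequencies.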

The epidemic curve is a special type of histogram in which time is the variable
plotted on the x-axis. The epidemic curve represents the occurrence of cases of a
health problem by date of onset during an epidemic (e.g., an outbreak of paralytic
poliomyelitis in Oman [see Figure V.6]). Usually the class intervals on the x-axis
should be less than one-fourth of the incubation period of the disease, and the
intervals should begin before the first reported case during the epidemic in order to
portray any identified background cases of the condition being graphed.

Cumulative  frequency  and  survival  curves 

A cumulative frequency curve is used for both continuous and categorical data. It
plots the cumulative frequency on the y-axis and the value of the variable on the
x-axis. Cumulative frequencies can be expressed either as the number of cases or as a
percentage of total cases. For categorical data, the cumulative frequency is plotted
at the right-most end of each class interval (rather than at the midpoint) to depict
more realistically the number or percentage of cases above and below the x-axis value
(Figure V.7). When percentages are graphed, the cumulative frequency curve allows
easy identification of medians, quartiles, and other percentiles of interest.
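
The values plotted on a cumulative frequency curve, and the way a percentile is read from it, can be sketched as follows. The function names, class bounds, and counts are hypothetical, and the percentile is located only to the nearest class boundary rather than by interpolation.

```python
from itertools import accumulate

def cumulative_percent(frequencies):
    """Cumulative frequency at the end of each class, as a percentage of the total."""
    total = sum(frequencies)
    return [100.0 * running / total for running in accumulate(frequencies)]

def first_class_reaching(upper_bounds, cum_pct, percentile):
    """Upper bound of the first class whose cumulative percentage reaches `percentile`."""
    for bound, pct in zip(upper_bounds, cum_pct):
        if pct >= percentile:
            return bound
    raise ValueError("percentile must be between 0 and 100")

# Hypothetical cases by age class; each value in upper_bounds is the right-most end of a class.
upper_bounds = [10, 20, 30, 40, 50]
cases = [5, 15, 40, 30, 10]

cum_pct = cumulative_percent(cases)
print(cum_pct)                                           # [5.0, 20.0, 60.0, 90.0, 100.0]
print(first_class_reaching(upper_bounds, cum_pct, 50))   # 30: the median lies in the 20-30 class
```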

A survival curve (Figure V.8) is useful in a follow-up study for graphing the
percentage of subjects remaining until an event occurs in the study. The x-axis
represents time, and the y-axis is percentage surviving. A difference in orientation
exists between cumulative frequency and survival curves (Figures V.7, V.8).

Frequency  polygons 

A frequency polygon is constructed from a histogram by connecting the midpoints of the
class intervals with straight lines. A frequency polygon is useful for comparing
frequency distributions from different data sets (Figure V.9). Detailed instructions
for constructing frequency polygons are presented elsewhere (28,29).

Charts 

Charts are useful graphics for illustrating statistical information. Many types of
charts can be used (28-30). They are most suited and helpful for comparing magnitudes
of events in categories of a variable. In the paragraphs below, we describe several
of the most frequently used types of charts.

Bar  charts 

Bar  charts  are  one  of  the  simplest  and  most  effective  ways  to  present  comparative 
data.   A  bar  chart  uses  bars  of  the  same  width  to  represent  different  categories  of  a 
factor.   Comparison  of  the  categories  is  based  on  linear  values  since  the  length  of  a 
bar  is  proportional  to  the  frequency  of  the  event  in  that  category.   Therefore,  scale 
breaks  could  cause  the  data  to  be  misinterpreted  and  should  not  be  used  in  bar  charts. 
Bars  from  different  categories  are  separated  by  spaces  (unlike  the  bars  in  a 
histogram). Although most bars are vertical, they may be depicted horizontally. They
are usually arranged in ascending or descending length, or in some other systematic
order.

Several  variations  of  the  bar  chart  are  commonly  used.  The  grouped  or  multiple-unit 
bar chart compares units within categories (Figure V.10). Generally the number of
units within a category is limited to three for effective presentation and
understanding.

A stacked bar chart is also used to compare different groups within each category of a
variable. However, it differs from the grouped bar chart in that the different groups
are differentiated not with separate bars, but with different segments within a single
bar for each category. The distinct segments are illustrated by different types of
shading, hatching, or coloring, which are defined in a legend (Figure V.11).

The  deviation  bar  chart  illustrates  differences  in  either  direction  from  a  baseline. 
This  type  of  chart  is  especially  useful  for  demonstrating  positive-negative  and 
profit-loss  data  or  comparisons  of  data  at  different  times  (Figure  V.12).   The 
incorporation  of  a  confidence  interval-like  portion  in  the  bars  provides  additional 
useful  information. 

Pie  charts 

A pie chart represents the different percentages of categories of a variable by
proportionally sized pieces of pie (Figure V.13). The pieces are usually denoted with
different colors or shading, and the percentages are written inside or outside the
pieces to allow the reader to make accurate comparisons.

Maps 

Maps are the graphic representation of data using location and geographic coordinates
(33). A map generally provides a clear, quick method for grasping data and is
particularly  effective  for  readers  who  are  familiar  with  the  physical  area  being 
portrayed.   A  few  popular  types  of  maps  that  depict  incidence  or  distribution  of 
health  conditions  are  described  below. 

Spot  maps 

A  spot  map  is  produced  by  placing  a  dot  or  other  symbol  on  the  map  where  the  health 
condition  occurred  or  exists  (Figure  V.14).   Different  symbols  can  be  used  for 
multiple  events  at  a  single  location.   Although  a  spot  map  is  beneficial  for 
displaying  geographic  distribution  of  an  event,  it  does  not  provide  a  measure  of  risk 
since  population  size  is  not  taken  into  account. 

Choropleth maps

A choropleth map is a frequently used statistical map involving different types of
shading, hatching, or coloring to portray range-graded values (Figure V.15). It is
also called a shaded or area map. Choropleth maps are useful for depicting rates of
a health condition in specific areas.

Care must be taken in interpreting choropleth maps because each area is shaded
uniformly regardless of any demographic differences within an area. For example, most
of a county may be relatively sparsely populated by low-income persons, whereas a
small portion of that county may be densely inhabited by persons with higher incomes;
and the rate at which a particular health condition occurs may falsely appear to be
evenly distributed by location and by socioeconomic status throughout the county.
Choropleth maps can also give the false impression of abrupt change in number or rate
of a condition across area boundaries when, in fact, a gradual change may have
occurred from one area to the next.

Density-equalizing  maps 

A  density-equalizing  or  rubber  map  (Figure  V.16)  transforms  actual  geographic 
coordinates  to  produce  an  artificial  figure  in  which  area  or  population  density  is 
equal throughout the map (34). Density-equalizing maps correct for the confounding
effect  of  population  density  and  thus  are  particularly  useful  in  analyzing  geographic 
clusters  of  public  health  events. 

Several  algorithms  exist  to  transform  coordinates  of  maps.   Any  transformation  routine 
should  define  a  continuous  transformation  over  the  map  domain,  solve  for  the  unique 
solution  that  minimizes  map  distortion,  accept  optional  constraints,  and  avoid 
overlapping of transformed areas (35).

INTERPRETATION  OF  SURVEILLANCE  DATA 

The  real  art  of  conducting  surveillance  lies  in  interpreting  what  the  data  say.   Data 
need  to  be  interpreted  in  the  context  of  our  understanding  of  the  etiology, 
epidemiology,  and  natural  history  of  the  disease  or  injury.   The  interpretation  should 
focus  on  aspects  which  might  lead  to  improved  control  of  the  condition.   By  proceeding 
from  the  simple  to  the  complex,  investigators  can  use  surveillance  as  a  basis  for 
taking  appropriate  public  health  action.   Epidemics  can  be  recognized,  preventive 
strategies  applied,  and  the  effect  of  such  actions  can  be  assessed.  The  key  to 
interpretation  lies  in  knowing  the  limitations  of  the  data  and  being  meticulous  in 
describing them. One axiom to be kept in mind always is that, because of the
descriptive nature of surveillance data, correlation does not equal causation.

Limitations  in  Data 

No  surveillance  system  is  perfect;  however,  most  can  be  useful.   Several  problems 
inherent  in  data  obtained  through  surveillance  must  be  recognized  if  the  data  are  to 
be  interpreted  correctly. 

Underreporting

Because most surveillance systems are based on conditions reported by health-care
providers, underreporting is inevitable. Depending on the condition, 5%-80% of cases
that actually occur will be reported (36-39). However, the need for completeness of
reporting, particularly for common health problems, may be exaggerated. Disease
trends by time, place, and person can frequently be detected even with incomplete
data. So long as the underreporting is relatively consistent, incomplete data can
still be applied to derive useful inferences. For problems that occur infrequently,
the need for completeness becomes more important.
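
A small numerical sketch may make the point about consistent underreporting concrete. If the same fraction of cases is reported in every period, the period-to-period ratios of reported cases equal those of the true cases, so the trend is preserved; if the reporting fraction shifts, an artifactual change appears. All names, counts, and reporting fractions below are hypothetical.

```python
def period_to_period_ratios(counts):
    """Ratio of each period's count to the count in the preceding period."""
    return [later / earlier for earlier, later in zip(counts, counts[1:])]

def reported_cases(true_counts, reporting_fractions):
    """Apply a reporting fraction to each period's true count."""
    return [n * f for n, f in zip(true_counts, reporting_fractions)]

# Hypothetical true cases per period.
true_counts = [200, 260, 390, 312]

# Consistent reporting: 25% of cases are reported in every period.
consistent = reported_cases(true_counts, [0.25, 0.25, 0.25, 0.25])
# Inconsistent reporting: the reported fraction doubles in the third period.
inconsistent = reported_cases(true_counts, [0.25, 0.25, 0.5, 0.5])

print(period_to_period_ratios(true_counts))   # the true trend
print(period_to_period_ratios(consistent))    # same ratios as the true trend
print(period_to_period_ratios(inconsistent))  # an artifactual jump in the third period
```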

Unrepresentativeness  of  reported  cases 

Health  conditions  are  not  reported  randomly.  For  example,  illnesses  dealt  with  in  a 
public  health  facility  are  reported  disproportionately  more  frequently  than  those 
diagnosed  by  private  practitioners.  A  health  problem  that  leads  to  hospitalization  is 
more  likely  to  be  reported  than  problems  dealt  with  on  an  outpatient  basis.  Thus, 
reporting  biases  can  distort  interpretation.  When  it  is  possible,  adjusting  for 
skewed  reporting  will  allow  investigators  to  obtain  a  more  accurate  picture  of  the 
occurrence  of  a  health  problem.  Collecting  data  from  multiple  sources  may  help 
provide  ways  to  improve  the  representativeness  of  the  information. 

Inconsistent  case  definitions 

Different  practitioners  frequently  use  different  case  definitions  for  health  problems. 
The  more  complex  the  diagnostic  syndrome,  the  greater  the  difficulty  in  reaching 
consensus  on  a  case  definition.   Moreover,  with  newly  emerging  problems,  as 
understanding  of  their  natural  history  progresses,  we  frequently  adjust  the  case 
definition  to  allow  greater  accuracy  of  diagnosis.   Persons  who  interpret  surveillance 
data  must  be  aware  of  any  changes  in  case  definitions  and  must  adjust  their 
interpretations accordingly.

Approach  to  Interpretation 

Creative  interpretation  of  surveillance  data  requires  more  common  sense  than 
sophisticated  reasoning.  The  data  can  speak  for  themselves.  Brainstorm  and  test,  if 
possible,  all  potential  explanations  for  an  observed  pattern.   Has  the  nature  of 
reporting  changed?  Have  providers  or  new  geographic  areas  entered  the  surveillance 
system?  Has  the  case  definition  changed?  Has  a  new  intervention,  such  as  screening 
or  therapy,  been  introduced? 

Consistency  among  different  surveillance  systems  is  probably  the  most  crucial  factor 
affecting  interpretation.   If  different  surveillance  data  sets  from  different 
locations  show  similar  trends,  the  likelihood  that  the  effect  is  real  increases. 
Examine  trends  in  different  age  groups.   Finally,  choose  the  surveillance  system  you 
think  represents  the  highest  quality  local  information.   If  the  trends  of  the  health 
problem  are  evident  there,  you  can  be  more  confident  about  your  interpretations. 

To facilitate interpretation of surveillance data, formats can be designed to
determine whether the number of reported cases of a health problem for a specified
reporting period differs from that of a previous period. An example of such a
"user-friendly" format has been published in CDC's Morbidity and Mortality Weekly Report
(MMWR) since 1990 (40,41). Known simply as "Figure 1," the graph uses horizontal bars
to indicate the ratio of the current level of disease to the previous 5-year average
(Figure V.12). Striping in the bars shows whether the number of reported cases during
the most recent 4-week interval is higher or lower than expected based on the
mean and two standard deviations of the 4-week totals. A change in the occurrence of
disease identified by this approach indicates the need for more detailed examination
of the data and may indicate an epidemic. Other diverse statistical techniques can
be used to detect aberrations in surveillance data (42; see Chapter VI).
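
The comparison underlying such a display can be sketched in a few lines. This is a simplified illustration of the idea described above, not the published MMWR procedure: which historical 4-week totals enter the baseline, and the use of the sample standard deviation, are assumptions made here. The function name and all counts are hypothetical.

```python
from statistics import mean, stdev

def compare_to_baseline(current_total, historical_totals):
    """Return (ratio to historical mean, flag) for the current 4-week total.

    The flag reports whether the current total lies more than two sample
    standard deviations above or below the mean of the historical totals.
    """
    baseline = mean(historical_totals)
    spread = stdev(historical_totals)
    ratio = current_total / baseline
    if current_total > baseline + 2 * spread:
        flag = "higher than expected"
    elif current_total < baseline - 2 * spread:
        flag = "lower than expected"
    else:
        flag = "within expected range"
    return ratio, flag

# Hypothetical 4-week totals for comparable periods in previous years.
historical = [40, 44, 36, 42, 38]

print(compare_to_baseline(50, historical))  # (1.25, 'higher than expected')
print(compare_to_baseline(41, historical))  # (1.025, 'within expected range')
```

A flag of this kind is only a prompt for closer examination of the data; it does not by itself establish that an epidemic is under way.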

INTERPRETIVE USES FOR SURVEILLANCE DATA

Identifying  Epidemics 

An important use of surveillance data is in determining whether increases in numbers
of cases of a health condition at the local or national level represent outbreak
(i.e., epidemic) situations that require immediate investigation and intervention.
Thus, a surveillance system can function as an early warning signal for public health
officials. For example, increases in numbers of cases of hepatitis B among military
recruits provided the stimulus to intervene with drug-prevention programs (43). CDC's
Birth Defects Monitoring System identified increases in renal agenesis (44) during the
1970s and 1980s, which prompted an investigation. Monitoring of regional trends in
rubella and congenital rubella identified outbreaks among the Amish in 1989-1990 (45).
A national registry of anti-abortion-associated violence clearly documented an
"epidemic" of attacks in the mid-1980s, which decreased after vigorous prosecution was
initiated by the Federal Bureau of Investigation (46).

The utility of surveillance data in detecting epidemics is highest in situations in
which cases of the health condition occur over a wide geographic area or gradually
over time. In such situations, the time-place-person links among cases probably would
not be recognized by individual practitioners (3). Typical examples occur with
infectious diseases, when laboratory monitoring of unusual serotypes or
antibiotic-resistance patterns identifies outbreaks of specific microorganisms that might otherwise
have gone unnoticed. Nationwide epidemics of Salmonella newport (47), S. enteritidis
(48), and Shigella sonnei (49) have been detected through surveillance.

Identifying  New  Syndromes 

The most dramatic use of surveillance data occurs when a "new" syndrome emerges from
an ongoing monitoring system. Legionnaires' disease was detected and subsequently
characterized as the result of an outbreak of non-influenza pneumonia within a
specific place and population (50). Acquired immunodeficiency syndrome (AIDS) was
recognized both because of rapid increases in requests for CDC's pentamidine supply
and because it occurred in a special time (early 1981), place (California, New York),
and person (men having sex with men) setting (51). Finally, the national scope of the
epidemic of eosinophilia-myalgia syndrome (EMS) was noticed because its unique
features were like those of toxic oil syndrome (52).

Monitoring  Trends 

Even if specific outbreaks or new syndromes cannot be identified by tracking
surveillance data, the baseline level of the health condition being monitored reflects
any variation in its occurrence over time. This purpose is especially relevant to
assessing events associated with reproductive health (e.g., ectopic pregnancy or
neonatal mortality), chronic disease, or infections with a long latency. The
progressive decline (until recently) of tuberculosis in the 20th century and the
constant increase in numbers of cases of AIDS throughout the 1980s reflect this
monitoring function (53,54).

Evaluating  Public  Policy 

Surveillance data can assess the health impact, positive or negative, of specific interventions
or of public policy. The rapid fall in numbers of cases of poliomyelitis and measles
after national vaccination campaigns were instituted is a classic example of the
usefulness of surveillance data (55,56). Creative interpretation of surveillance data
has also been applied to noninfectious conditions; the impact, in such situations, is
somewhat more difficult to assess. For example, in Washington, D.C., the adoption of
a gun-licensing law coincided with an abrupt decline in firearm-related homicides and
suicides (57). No similar reductions occurred in the number of homicides or suicides
committed by other means, nor did states adjacent to the District experience any
reductions in their rates of firearm-related homicides or suicides. Also,
surveillance of legal abortions and of deaths associated with illegal abortion has
helped trace the public health impact of this controversial health problem (8,58,59).
After legal abortion became widely available, deaths from illegal abortion decreased
markedly; however, restriction of federal funds for abortion had a negligible effect
on health parameters (60).

Though it is tempting to use trends in disease and injury to monitor the impact of
community interventions, such evaluation becomes increasingly suspect when several
factors contribute to the occurrence of the disease or health condition being monitored.
In addition, if only a portion of the population accepts an intervention, analysis and
interpretation of surveillance data are made even more difficult. Frequently,
surveillance of process measures or other health problems can act as a proxy for the
intended outcome. Moreover, finding comparability in data from several populations
that have attempted similar public health programs strengthens evidence that the
interpretation is correct. For example, to evaluate the effectiveness of allowing
people to exchange used hypodermic needles for new ones as a means of preventing AIDS,
epidemiologists could simultaneously examine trends in numbers of needles distributed,
surveys of needle use, and incidence of higher-prevalence infections such as hepatitis
B.

Projecting  Future  Needs 

Mathematical models based on surveillance data can be used to project future trends.
This tool helps health officials determine the eventual need for preventive and
curative services. Recently such modelling assisted in estimating the impact of AIDS
on the United States health-care system in the 1990s (61). Such projections addressed
not only the demand for AZT by HIV-infected persons with low CD4 lymphocyte counts but
also the requirements for hospital care for persons with life-threatening
superinfections later in the course of HIV-related disease. In addition,
models based on surveillance data can predict the decline of morbidity and/or
mortality when there are changes in risk factors among the population at risk.
Examples of this application include projecting the decline in cardiovascular disease
on the basis of decreased smoking of cigarettes (62), the decline in cirrhosis-related
mortality in the presence of lower levels of alcohol use (63), and decreased rates of
mortality from cervical cancer associated with an increase in the prevalence of
hysterectomy (64).

Chapter VI


Special Analytic Issues


Donna F. Stroup


"There is only one good, that is knowledge. There is only one evil, that is
ignorance."

Socrates

NATURE  OF  PUBLIC  HEALTH  SURVEILLANCE  DATA 

Data obtained in a public health surveillance system have several characteristics that
affect analyses. Most fundamentally, data from most surveillance systems are not
generated from a designed study or randomized trial. Although this departure has been
addressed in the context of epidemiologic studies and field investigations (1), the
effect in the surveillance setting has specific consequences.

First,  for  a  surveillance  system,  data  are  reported  regularly,  and  may  be  updated 
after  the  initial  report.   Since  the  lag  time  between  first  report  and  subsequent 
updating  may  vary  by  health  event  or  reporting  location,  methods  developed  for  early 
detection  of  aberrations  in  the  data  should  be  applied  as  soon  as  provisional  data  are 
available.   If  the  analyses  are  implemented  as  part  of  a  routine  surveillance  program, 
results  can  be  monitored  as  data  are  updated. 

Second,  surveillance  data  are  generated  by  a  spatial  as  well  as  a  temporal  process. 
For  example,  at  a  given  point  in  time,  cases  of  a  disease  for  a  given  area  may  not 
appear  excessive;  however,  when  compared  with  other  times  or  other  areas  at  a  given 
time, an excess may become apparent (2).

Third, when only aggregated data are available (e.g., from regions, counties, or
states), the distribution of cases in the underlying population cannot be assessed
directly.  This  problem  is  compounded  because  the  areas  of  aggregation  are  usually 
arbitrarily  defined  and  case  definitions  are  not  consistent  within  areas.   As  a 
result,  statistical  inferences  concerning  the  properties  of  individuals  are  confounded 
by  the  properties  of  the  aggregated  system. 

Finally, the surveillance process is generally a multivariate one (3). Multiple
health  events  under  surveillance  may  be  related  for  a  given  point  in  time  for  the  same 
area,  or  the  relationship  may  be  delayed  in  time  for  the  same  or  nearby  areas  if 
diagnosis  is  uncertain  or  confirmation  is  delayed.  The  multivariate  nature  of  this 
process  should  be  used  to  improve  the  ability  of  any  method  to  detect  aberrations  from 
a  baseline. 

CLUSTERING  OF  HEALTH  EVENTS 

One foundation of the science of epidemiology is the study of the departure of the
observed patterns of the occurrence of disease from the expected pattern of occurrence
(4). Variations in the usual incidence of health events in different geographic areas
or different time periods may provide important clues to specific risk factors or even
to the etiology of the problem. The expected numbers of reported health events are
generated by a process involving human behavior and transmission of disease, and
patterns of occurrence within human populations may lead to hypotheses about the
determinants of the health problem (5).

The public health community continues to struggle with nomenclature for such
variations. The term "cluster" can be defined as "a set of events occurring unusually
close together to each other in time or space, in both time and space, or within the
limits of demographic characteristics (e.g. persons in the same occupation)."
"Cluster" is usually used to describe uncommon events (e.g., leukemia, suicide) and
tends to evoke emotional response from members of the public or from the media.

A related term is "epidemic," historically used to describe aggregation of infectious
diseases: "an outbreak of a disease spreading rapidly from person to person" (6).
More recently, the concept has broadened to the following: "the occurrence in a
community or region of cases of an illness, specific health-related behavior, or other
health-related events clearly in excess of normal expectancy .... The number of
cases indicating the presence of an epidemic will vary according to agent, size and
type of population exposed, previous experience or lack of exposure to the disease and
time and place of occurrence; thus, epidemicity is relative to the usual frequency of
the disease in the same area, among the specified population, at the same season of
the year" (7). It is prudent to remember that the term "epidemic" evokes responses
beyond these definitions. In late 1988, the British Public Health Laboratory Service
used "epidemic" to describe an increase in reported numbers of cases of Salmonella
enteritidis associated with contaminated chicken and eggs. The country's Chief Medical
Officer, Sir Donald Acheson, advised caution "...in using the word epidemic when
addressing the public because of its connotations with terrifying diseases such as
cholera and smallpox" (8). The term "outbreak" has less evocative connotations. With
all such definitions, a critical concept is the comparison of an observed number with
what is usual or normal. The distinction made here is that "aberration" will be used
to denote changes in the occurrence of health events that are statistically
significant when compared with usual or normal history. The definition of an epidemic
may require the existence of an aberration; e.g., the Centers for Disease Control (CDC)
declares that an epidemic of a specific strain of influenza is occurring only if the
number of reported deaths exceeds a 95% confidence limit in the forecast for two or
more consecutive periods. In general, application of the term "epidemic" may require
epidemiologic conditions beyond the statistical ones, e.g., laboratory isolates or
resistance to vaccine.
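
The statistical part of the influenza rule described above (reported deaths above the
upper confidence limit for two or more consecutive periods) can be sketched in a few
lines of Python. This is only an illustration: the function name, the input lists, and
the numbers are hypothetical, and the forecast limits are assumed to have been computed
elsewhere.

```python
def epidemic_periods(observed, upper_limits, run_length=2):
    """Return indices of periods that belong to a run of at least
    `run_length` consecutive periods with observed > upper limit."""
    exceeds = [obs > lim for obs, lim in zip(observed, upper_limits)]
    flagged = set()
    run_start = None
    # The trailing False is a sentinel that closes a run ending at the last period.
    for i, flag in enumerate(exceeds + [False]):
        if flag and run_start is None:
            run_start = i
        elif not flag and run_start is not None:
            if i - run_start >= run_length:
                flagged.update(range(run_start, i))
            run_start = None
    return sorted(flagged)


# Hypothetical weekly deaths and hypothetical upper 95% forecast limits.
deaths = [40, 52, 41, 55, 58, 44]
limits = [50, 50, 50, 50, 50, 50]
print(epidemic_periods(deaths, limits))  # [3, 4]
```

The single exceedance in period 1 is not flagged because it is not sustained; only the
two-period run counts. As the text notes, such a flag is a statistical aberration and
does not by itself establish an epidemic.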

In this chapter, "aberration" is used to describe statistical departures from a usual
distribution.   It  is  important  to  understand  that  such  departures  do  not  necessarily 
signal  the  "onset  of  an  epidemic"  or  the  "presence  of  a  cluster."   Conversely,  one  can 
have  an  epidemic  even  in  the  absence  of  a  statistical  increase,  such  as  when  infant 
mortality  is  "low"  but  still  higher  than  expected.   The  methods  developed  here  are 
intended  for  routine  use  by  the  public  health  analyst,  in  conjunction  with 
epidemiologic  investigation  and  close  communication  with  the  source  of  the 
surveillance  reports. 

ABERRATIONS  IN  TIME 

Since  the  definition  of  surveillance  implies  ongoing  data  collection,  perhaps  the  most 
fundamental  question  suggested  by  the  analysis  of  a  surveillance  system  is  the 
following:  When  does  the  value  of  reported  events  signal  a  change  in  the  process  from 
past  patterns?  Although  fundamental,  the  analysis  required  to  address  this  question 
suggests  additional  questions.   How  are  "past  patterns"  defined?   If  an  outbreak 
occurred  in  the  past,  should  this  affect  the  definition  of  a  change?  Other  than  the 
disease  or  injury  process  itself,  what  other  factors  could  cause  a  change? 

In the paragraphs below, we use the term "baseline" to denote historical data and
"current report" to denote the recent data on which the assessment is based.

Graph  of  Current  and  Past  Experience 

State health departments report the numbers of cases of about 50 notifiable diseases
each week to CDC's National Notifiable Diseases Surveillance System (NNDSS). The list
of health events is determined collaboratively by the Council of State and Territorial
Epidemiologists and CDC (9,10). Each week provisional reports are published in the
Morbidity and Mortality Weekly Report (MMWR) and are made available to
epidemiologists, clinicians, and other public health professionals in a timely manner.
Although the tables of the MMWR continue to provide important information, the volume
of data and the need for ease of interpretation encouraged the development of a
graphic display to highlight unusually high or low numbers of reported cases.

A new analytic and graphical method was adopted for this system to achieve the
following objectives: a) to portray in a single comprehensible figure the weekly
reports of data for approximately 20 diseases and to compare those data with past
results, and b) to highlight for further analysis the results most likely to reflect
either long-term trends or epidemics. These objectives were formulated to reflect the
most recent behavior in as short a time period as possible for weekly publication, but
over a long enough period to assure stable results. To facilitate comprehension, the
same method is used for all diseases portrayed.

The analytic method currently used for constructing Figure I in the MMWR (see
Figure VI. 12), called the "CDC MMWR Current/Past Experience Graph (CPEG)," compares
the number of reported cases in the current 4-week period for a given health event
with historical data on the same condition from the preceding 5 years (11,12).
Numbers of cases in the current month are listed to facilitate interpretation of
instability caused by small numbers.

The  choice  of  4  weeks  as  the  "current  period"  was  based  on  evidence  that  weekly 
fluctuation  in  data  from  disease  reports  usually  reflects  irregular  reporting 
practices  rather  than  actual  incidence  of  disease.  The  use  of  5  years  of  history 
achieves  the  objective  of  using  the  same  model  for  all  conditions  portrayed,  since 
some health events were made notifiable only recently (e.g., acquired immunodeficiency
syndrome [AIDS] and legionellosis).

Also, modelling of reported influenza incidence has shown that more accurate forecasts
are based on more recent data (13). To increase the historical sample size and to
account for any seasonal effect, the baseline is built from the reported numbers of
cases for the preceding 4-week period, the corresponding 4-week period, and the
following 4-week period, for each of the previous 5 years. This yields 15 correlated
observations, referred to as the historical observations, or "baseline"
(Figure VI. 1); their average is the historical average.

The deviation from unity of the ratio of the current 4-week total to the historical
average is indicative of a departure from past patterns. We plot this ratio on a
logarithmic scale so that an n-fold increase projects to the right the same distance
as an n-fold decrease projects to the left, and no change from past patterns (1:1)
produces a bar of zero length (14). To distinguish the conditions that may require
further investigation, the hatching on the bars begins at a point based on the mean
and standard deviation of the historical observations.*
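
The ratio and the historical limits described above (and in the accompanying footnote)
can be written out as a short Python sketch. It is an illustration only: the function
name and the baseline values are hypothetical, the use of the sample standard deviation
is an assumption (the text does not say which form is used), and base 10 is an arbitrary
choice for the logarithmic scale.

```python
import math
from statistics import mean, stdev


def cpeg(current_total, baseline):
    """Compare the current 4-week total with the 15 historical 4-week totals."""
    m = mean(baseline)
    s = stdev(baseline)  # sample standard deviation (an assumption)
    ratio = current_total / m
    lower = 1 - 2 * s / m  # historical limits: 1 -/+ 2 * SD / mean
    upper = 1 + 2 * s / m
    # On a log scale an n-fold increase and an n-fold decrease have equal length.
    log_ratio = math.log10(ratio) if ratio > 0 else float("-inf")
    return {
        "ratio": ratio,
        "log_ratio": log_ratio,
        "limits": (lower, upper),
        "beyond_limits": ratio < lower or ratio > upper,
    }


# Hypothetical baseline: 3 four-week periods in each of 5 previous years.
baseline = [8, 10, 12] * 5
print(cpeg(20, baseline)["beyond_limits"])  # True
print(cpeg(11, baseline)["beyond_limits"])  # False
```

A ratio of exactly 1 gives a log ratio of 0, which corresponds to the bar of zero
length in the graph.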

An evaluation of this method shows that it has good statistical robustness to patterns
in the data and high sensitivity and predictive value positive for epidemiologically
confirmed outbreaks (15). An outbreak of rubella detected by this method proved to be
of substantial public health importance (16). Recent increases beyond historical
limits in reporting of aseptic meningitis reflected increased disease activity
primarily in the northeastern United States (17).

TIME-SERIES METHODS

The method used by CDC to estimate excess mortality associated with influenza was
developed from a 1932 study that defined the expected number of weekly deaths from
pneumonia and influenza, or from all causes, as the median number of deaths for a
given week during non-epidemic years (18). "Excess deaths," then, were defined as the
difference between the observed and the conditional expected numbers, a
one-period-ahead forecast. Later, a regression model was fitted to weekly pneumonia and
influenza data from U.S. cities to calculate an expected number of deaths (19). In
1979, CDC proposed a new method to estimate expected deaths using a body of methods
called time-series analysis (20). More recently, a method forecasting separate expected
numbers by age group has been investigated (13).
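
The earliest definition above (expected deaths as the median for a given week across
non-epidemic years, and excess deaths as observed minus expected) can be sketched as
follows. The data layout, year labels, and counts are hypothetical, and deciding which
years count as epidemic is assumed to be done beforehand.

```python
from statistics import median


def expected_deaths(history_by_year, week, epidemic_years=()):
    """Median count for `week` across years not labelled as epidemic."""
    counts = [
        weeks[week]
        for year, weeks in history_by_year.items()
        if year not in epidemic_years
    ]
    return median(counts)


def excess_deaths(observed, expected):
    """Observed minus expected; a negative value means fewer deaths than expected."""
    return observed - expected


# Hypothetical weekly deaths for four earlier years (two weeks shown per year).
history = {1: [100, 110], 2: [105, 120], 3: [300, 310], 4: [95, 115]}
expected = expected_deaths(history, week=0, epidemic_years={3})
print(expected, excess_deaths(130, expected))  # 100 30
```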

The  methodology  of  time  series  is  appropriate  for  data  available  sequentially  over 
time.  A  time-series  model  generally  comprises  components  estimating  the  effect  of 
secular  trend,  cycles,  or  year-to-year  seasonal  patterns.  The  process  of  model 
fitting  consists  of  identification,  estimation,  and  diagnostic  validation.   One  then 
evaluates  competing  models  on  the  basis  of  the  fit  of  the  models  to  the  observed  data 
and  of  the  accuracy  of  the  forecasts. 


*Historical limits of the ratio of current reports to the historical mean are calculated
as 1 plus or minus 2 times the standard deviation divided by the mean, where the mean and
the standard deviation are calculated from the 15 historical 4-week periods.

Most common methods of time-series analysis, such as the Autoregressive Integrated
Moving Average (ARIMA) models (21), are appropriate for relatively long series of data
that exhibit certain regular properties over the entire series. Differencing, or
forming a new series by subtracting adjacent observations, is generally used to create
a series with a stationary mean, that is, a series without trend. An additional
property, stationarity of the variance, is generally required, so that the process does
not become more or less variable over time. An autoregressive model includes terms that
model the data at one point in time as a function of previous data. A moving-average
term creates a series from averages of adjacent observations and is used to model
cycles in the data.
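
Two of the building blocks named above, differencing and an autoregressive term, can be
illustrated with a minimal Python sketch. It is not a full ARIMA fit: the function names
are hypothetical, the `lag` argument is an added convenience, and the first-order
autoregressive coefficient is estimated by simple least squares on the mean-centered
series, which is only one of several possible estimators.

```python
def difference(series, lag=1):
    """Subtract the observation `lag` steps earlier from each observation."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def fit_ar1(series):
    """Least-squares estimate of phi in (x_t - m) = phi * (x_{t-1} - m) + error."""
    m = sum(series) / len(series)
    x = [v - m for v in series]
    num = sum(x[t] * x[t - 1] for t in range(1, len(x)))
    den = sum(v * v for v in x[:-1])  # zero for a constant series (not handled here)
    return m, num / den


def forecast_next(series):
    """One-period-ahead forecast from the fitted first-order model."""
    m, phi = fit_ar1(series)
    return m + phi * (series[-1] - m)


# A hypothetical series with a steady upward trend.
counts = [10, 12, 14, 16, 18]
print(difference(counts))  # [2, 2, 2, 2]
```

After differencing, the trending series has a constant level, which is the sense in
which differencing produces a stationary mean.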

The advantage of time-series models for surveillance over other modeling methods, such
as regression, is that the estimation process accounts for period-to-period
correlations and seasonality, as well as long-term secular trends. A more detailed
description of the concepts used in time series is available elsewhere (21).

Scan  Statistic 

Consider  this  surveillance  question:   Is  the  number  of  cases  reported  for  a  certain 
time  period  excessive?  While  ARIMA  time-series  methods  provide  one  approach  to  the 
answer,  often  the  mechanics  of  this  analysis  are  complex.   The  scan  statistic  (22) 
offers  a  relatively  simple  alternative  in  this  situation.   The  scan  statistic  is  the 
maximum  number  of  reported  cases  (i.e.,  events)  in  an  interval  of  predetermined  length 
over  the  time  frame  of  interest.   It  is  used  to  test  the  null  hypothesis  of  uniformity 
of  reporting  against  an  alternative  of  temporal  clustering.  Consider  the  following 
setting.  Surveillance  data  are  reported  over  a  time  period  T,  containing  k   intervals 
of  equal  length: 

   n_1     n_2     ...     n_k
|-------|-------|  ...  |-------|
   t_1     t_2     ...     t_k
|<------------- T ------------->|

where the intervals $t_i$, $i = 1, 2, \ldots, k$, are of equal length $t$
and $T = t_1 + t_2 + \cdots + t_k$.

The total number of events reported in the entire time period is called $N$ and is the
sum of the numbers of events in each of the intervals, $N = n_1 + n_2 + \cdots + n_k$.
Let $n = \max\{n_i\}$, $i = 1, 2, \ldots, k$, be the largest report in any of the
intervals. Then compute $L = T/t$, the number of intervals in the entire time period.

The  statistical  question  addressed  by  the  scan  statistic  is:  What  is  the  probability 
that  the  maximum  number  of  cases  in  any  interval  of  length  t  is  equal  to  or  exceeds  n? 

For example, suppose the frequencies of trisomies among karyotyped spontaneous
abortions for a defined geographic area, by calendar month of last menstrual period in
1992, are as follows:


Month        Number of cases     Month        Number of cases

January             1            July                2
February            3            August              4
March               2            September           4
April               2            October             2
May                 4            November            3
June                3            December           10

What is the probability of 10 or more trisomies in December, given that there were a
total of 40 in 1992? Using the notation defined above, N = 40; T = 12; t = 1;
L = 12/1 = 12; and n = 10. Then, from tabulated values (23), the probability of 10 or
more trisomies in December, given 40 for the year, is 0.083.


            L = 8            L = 12           L = 15
  N       n     p          n     p          n     p

 35      14   0.002       11   0.007       10   0.007
 40      13   0.040       10   0.083        9   0.082
 40      14   0.012       11   0.024       10   0.021
 40      15   0.003       12   0.006       11   0.005
 45      14   0.042       11   0.064       10   0.053
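
The quantities used in the trisomy example above can be derived directly from the
monthly counts. The Python sketch below does only that and then looks up the probability
in a small dictionary copied from the tabulated values quoted in the text (the L = 8 and
L = 12 columns); it does not compute the exact distribution of the scan statistic. The
function and variable names are hypothetical.

```python
# Excerpt of the tabulated probabilities quoted in the text, keyed by (N, L, n).
TABLE = {
    (35, 8, 14): 0.002, (35, 12, 11): 0.007,
    (40, 8, 13): 0.040, (40, 12, 10): 0.083,
    (40, 8, 14): 0.012, (40, 12, 11): 0.024,
    (40, 8, 15): 0.003, (40, 12, 12): 0.006,
    (45, 8, 14): 0.042, (45, 12, 11): 0.064,
}


def scan_inputs(counts, t=1):
    """Return (N, n, L) for counts reported in intervals of equal length t."""
    N = sum(counts)         # total events in the whole period
    n = max(counts)         # largest count in any single interval
    T = len(counts) * t     # length of the whole period
    L = T // t              # number of intervals in the whole period
    return N, n, L


# Monthly trisomy counts from the example, January through December.
monthly = [1, 3, 2, 2, 4, 3, 2, 4, 4, 2, 3, 10]
N, n, L = scan_inputs(monthly)
print(N, n, L)             # 40 10 12
print(TABLE[(N, L, n)])    # 0.083
```

For combinations of N, L, and n outside the published tables, the approximations cited
in the text would be needed instead of a lookup.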


If the results of the scan statistic are to be useful, the lengths of the entire time
frame and the scanning interval must be determined a priori. The lack of extensive
tabulated values and the computer-intensive calculations for large sample sizes limit
the usefulness of the method. Approximations to the exact distribution are described
elsewhere (23-25).

ABERRATIONS IN SPACE AND TIME

Given cases of a health event reported from a defined geographic area over a defined
time period, can we say that the cases occur unusually close together in both space
and time? That is, do they form a spatial-temporal cluster? Traditional approaches
to the analysis of health-event aggregation in geographic areas have been based on
randomization arguments (26-27). A representative discussion follows.

One proposed method divides the study area into subareas (e.g., counties or census
tracts) and the study time period into intervals of constant length (e.g., months or
years) (28). The cases of the health event in each time-space "cell" are then
counted. The maximum count within any time interval is summed across all subareas
to obtain a test statistic. This method assumes equal population density across all
area cells and has limitations (29).
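
One reading of the statistic described above is: for each subarea, take the largest
count observed in any single time interval, then add these maxima over all subareas.
The Python sketch below computes only that sum; the subarea labels and counts are
hypothetical, and the reference distribution needed to judge significance is not given
in the text and is not attempted here.

```python
def max_count_statistic(cell_counts):
    """Sum, over subareas, of the largest count seen in any time interval.

    cell_counts maps each subarea to a list of counts, one per time interval.
    """
    return sum(max(counts) for counts in cell_counts.values())


# Hypothetical counts for three subareas over three time intervals.
cells = {"A": [0, 2, 1], "B": [3, 0, 0], "C": [1, 1, 1]}
print(max_count_statistic(cells))  # 6
```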

In Knox's method, all possible pairs of cases are examined, and each pair is
classified according to whether the case-patients in the pair lived "close" together
and had onset of the health problem (or report) "close" in time, resulting in the
2-by-2 table:

                        Reports close in time?

                           Yes      No
Reports close     Yes       a        b
in space?         No        c        d

Under  the  hypothesis  of  no  clustering,  the  expected  number  may  be  calculated  in  the 
usual  way,  with  an  adjustment  in  the  significance  test,  since  the  statistic  is  based 
on  pairs  of  cases  (30).     A  brief  example  follows. 

Consider  cases  of  a  disease  with  the  following  spatial  and  temporal  relationships: 

                        Close in space?

                        Yes     No     All

                 Yes      1      4       5
Close in time?   No       5     18      23
                 All      6     22      28

The test statistic to be computed is X = number of pairs close in both space and time,
1 in this example. We use row and column marginal totals to compute an expected value
for this cell: (6 × 5) / 28 = 1.07. We then use the Poisson distribution to compute the
probability of seeing one or more pairs close in space and time, given that we
expect 1.07; this probability is $1 - e^{-1.07} \approx 0.66$. Therefore, we conclude
that these data provide no evidence for space/time clustering.
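
The Knox calculation above can be reproduced with a short Python sketch. The first
function classifies every pair of cases using critical distances that the analyst must
supply; the second computes the expected number of close pairs from the marginal totals
and the Poisson tail probability. The function names, the coordinate layout, and the
use of straight-line distance are assumptions made for illustration.

```python
import math
from itertools import combinations


def knox_table(cases, space_crit, time_crit):
    """cases: list of (x, y, t). Returns pair counts (a, b, c, d):
    a close in space and time, b close in space only,
    c close in time only, d close in neither."""
    a = b = c = d = 0
    for (x1, y1, t1), (x2, y2, t2) in combinations(cases, 2):
        near = math.hypot(x1 - x2, y1 - y2) <= space_crit
        soon = abs(t1 - t2) <= time_crit
        if near and soon:
            a += 1
        elif near:
            b += 1
        elif soon:
            c += 1
        else:
            d += 1
    return a, b, c, d


def knox_p_value(a, close_space_pairs, close_time_pairs, total_pairs):
    """Expected close-close pairs and P(X >= a) for X ~ Poisson(expected)."""
    expected = close_space_pairs * close_time_pairs / total_pairs
    p_below = sum(
        math.exp(-expected) * expected ** k / math.factorial(k) for k in range(a)
    )
    return expected, 1 - p_below


# Marginal totals from the example: 6 pairs close in space, 5 close in time, 28 pairs.
expected, p = knox_p_value(1, 6, 5, 28)
print(round(expected, 2), round(p, 2))  # 1.07 0.66
```

Rerunning `knox_table` with several choices of `space_crit` and `time_crit` is one
simple way to see how sensitive the result is to those critical values.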

A criticism of Knox's method is that the choice of the critical time and space
distances is arbitrary. This problem was addressed for the question of spatial
clustering (31), and the method does not require spatial boundaries or assessment of
the entire population base. An alternative approach is demonstrated by Williams (32),
with a sensitivity analysis of the time and space critical values.

A second criticism of Knox's method is that it makes no allowance for edge effects,
which arise either from natural geographic boundaries (e.g., coastlines) or because
there are unrecorded cases outside the designated study region. A new method (33)
addresses this by altering the interpretation of expected pairs of close cases and
replacing the simple count of close pairs by a weighted sum. Recently, this new
method has been applied to test the hypothesis that many non-outbreak cases of
Legionnaires' disease in Scotland are not sporadic and to attempt to pinpoint cases
clustering in space and time (34).

It  is  important  to  emphasize  that  because  of  the  diverse  and  complicated  nature  of 
clusters,  there  is  no  single  test  to  assess  them.   The  statistical  sources  suggested 
here  are  intended  only  to  augment  other  epidemiologic  methods  in  a  systematic, 
integrated approach (35), coupled with flexibility in methods of analysis and
interpretation  of  significance  levels. 

COMPLETENESS  OF  COVERAGE 

Statistical methods are the basis of many aspects of evaluating a public health
surveillance system (36). For example, the question of completeness of a surveillance
system is fundamental to the system's usefulness. One approach to the assessment of
completeness involves a capture-mark-recapture technique, developed for the
enumeration of wildlife populations (37) and used by the U.S. Census Bureau (38).
The method requires two parallel surveillance systems, or a surveillance system and a
survey, measuring the incidence of a single health event, and provides an estimate of
the true total number of cases of that health event and of the completeness of coverage
of the two systems.


The Chandra Sekar-Deming (CSD) and Lincoln-Petersen Capture-Recapture (LPCR) methods
suggest the following structure for the analysis. Suppose two surveillance systems
for the same health event report totals of R and S cases, respectively, for some time
period. In addition, suppose it is possible to match the cases so that we know which C
of the cases are reported to both surveillance systems. This structure suggests the
following 2-by-2 table:

                                  Surveillance system 1

Surveillance              Cases        Cases not      All
system 2                  reported     reported       cases

Cases reported               C            N_2           S
Cases not reported           N_1
All cases                    R                          N


The  CSD  and  LPCR  methods  estimate  N,  the  total  number  of  cases  from  the  combined 
information,  and  provide  a  confidence  interval  for  that  estimate.  Using  the  notation 
suggested  in  the  table  above, 

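$N = \dfrac{(R+1)(S+1)}{C+1} - 1$

$\mathrm{Var}(N) = \dfrac{(R+1)(S+1)\,N_1 N_2}{(C+1)^2\,(C+2)}$

$95\%\ \mathrm{CI}(N) = N \pm 1.96\sqrt{\mathrm{Var}(N)}$

where $N_1$ and $N_2$ are the numbers of cases reported to only one of the two systems.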

Thus  the  completeness  of  each  surveillance  system  can  be  calculated  as  follows: 

Completeness  of  #1  =  R  /  N 
Completeness  of  #2  =  S  /  N. 

Consider  the  following  example.   There  exist  two  independent  surveillance  systems  for 
hepatitis  A  for  a  location  with  stable  population.  Suppose  that  the  events  identified 
in  either  of  the  two  systems  are  true  events,  that  the  matching  procedure  identifies 
all  true  matches,  and  only  true  matches  are  identified. 


                                  Surveillance system 1

Surveillance              Cases        Cases not      All
system 2                  reported     reported       cases

Cases reported              790            60          850
Cases not reported           50             X
All cases                   840                          N

The estimated number of cases missed by both systems is

$X = (50 \times 60) / 790 = 3.8 \approx 4$.

So, the estimated number of cases in the population under surveillance is

$N = 790 + 50 + 60 + 4 = 904$.

The formulas above yield a 95% confidence interval for N of 904 ± 4. The completeness
of surveillance system #1 is 840/904, or 0.93, and that of surveillance system #2 is
850/904, or 0.94.
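
The hepatitis A example can be checked with a brief Python sketch of the formulas given
earlier. The function name is hypothetical, and the code assumes that the two
single-system counts in the variance formula are R - C and S - C, which is consistent
with the table (50 and 60). Note that the closed-form estimate is about 903.8, whereas
the worked example rounds the count of missed cases first and reports 904.

```python
import math


def capture_recapture(R, S, C):
    """R, S: totals reported by systems 1 and 2; C: cases matched in both."""
    n1 = R - C  # reported only to system 1 (assumed meaning of N_1)
    n2 = S - C  # reported only to system 2 (assumed meaning of N_2)
    N = (R + 1) * (S + 1) / (C + 1) - 1
    var = (R + 1) * (S + 1) * n1 * n2 / ((C + 1) ** 2 * (C + 2))
    half_width = 1.96 * math.sqrt(var)
    return {
        "N": N,
        "ci": (N - half_width, N + half_width),
        "completeness_1": R / N,
        "completeness_2": S / N,
    }


est = capture_recapture(R=840, S=850, C=790)
print(round(est["N"]))                  # 904
print(round(est["completeness_1"], 2))  # 0.93
print(round(est["completeness_2"], 2))  # 0.94
```

Calling the function again with a different value of C shows directly how the estimates
respond to stricter or looser matching.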

The usefulness of results from this capture-recapture calculation is based on four
assumptions:

•  Surveillance is done for a closed population.

•  The matching procedure successfully identifies all true matches and, conversely,
only true matches are identified.

•  All events identified in either of the two systems are true events.

•  The two systems are independent.

Clearly, these assumptions are seldom if ever satisfied for public health surveillance
systems; however, this should not preclude the method as an investigative tool.
For example, at the national level, the lack of personal identifiers precludes
exact matching of cases between surveillance systems. However, other information
(age, gender, county, date of onset) may allow probability matching or estimates
of the overlap. Application of the LPCR method with more stringent or relaxed
matching criteria will yield bounds on the completeness of coverage that are still
useful for surveillance evaluation. For example, if we relax the matching criteria in
the table above so that 820 cases are reported to both systems, analogous
calculations show that the completeness of system #1 is 0.96, and that of system
#2 is 0.98.

SELECTION OF ANALYTIC METHODS

No  single  method  can  be  used  to  detect  all  epidemics  or  all  types  of  aberrations. 
Several  questions  provide  a  framework  for  choosing  an  analytic  method. 

What  is  the  purpose  of  the  surveillance  system?   The  data  used  for  the  CPEG 
analyses  are  reported  weekly  by  state  health  departments.  Although  each  state 
analyzes  its  own  data,  patterns  may  be  apparent  from  the  aggregated  national  picture 
that  may  facilitate  prevention  and  intervention  efforts.  Additionally,  the  data  are 
maintained  historically  for  the  archival  purposes  of  measuring  trends  and  assessing 
the  effects  of  interventions. 

What  is  the  purpose  of  the  analytic  method?   Since  a  single  method  cannot  be 
expected  to  distinguish  between  a  change  in  historical  trend  and  a  one-time  outbreak 
with  unsustained  increases,  the  analyst  must  identify  the  purpose  of  the  analysis 
before  choosing  an  analytic  method.   If  the  nature  of  the  data  is  determined  and  the 
questions  are  well-defined,  the  results  of  the  analytic  method  can  be  used  to  augment 
other  sources  of  information. 

The  purpose  of  CPEG   is  to  facilitate  the  routine  analysis  of  surveillance  data  and  to 
supplement  other  sources  of  information.   The  method  is  not  useful  for  conditions  with 
long-term  historical  trends.  When  the  data  have  complex  patterns,  it  may  be  helpful 
to remove (simplify) some of this pattern by modeling. The classical methods of
time-series analysis are appropriate for this situation, but these may not be accessible to
the  practicing  public  health  official. 

Which  conditions  should  be  monitored?  Routine  analysis  should  be  reserved  and 
adapted  for  conditions  for  which  there  are  public  health  interventions.   The  CPEG 
methodology  is  most  appropriate  for  conditions  with  historical  trends  that  do  not 
exhibit  frequent  changes  in  trend  or  level  and  that  occur  often  enough  so  that  a 
single case or two does not constitute a significant flag. If the raw data have not
already been analyzed for trend and period effects, and the numerator (present cases)
cannot be assumed to have the same variance as the observations in the denominator
(historical data), and if the series exhibits considerable correlation for first-order
(adjacent) observations and beyond, the CPEG method may be less powerful.
For rare conditions, the instability caused by small numbers of reported cases may
make the results unsuitable for repeated use.

What  is  the  (person,  place,  or  time)  unit  of  analysis?  We  chose  national  data 
for presentation of CPEG. The objective was to use as short and recent a time period
as possible for weekly publication, thus making the results useful for timely
intervention. However, variability in weekly reports reflecting factors other than
the disease process (e.g., delayed reports due to outbreaks) made the results
unstable. We then chose a 4-week window.

Because  of  the  interest  in  analytic  techniques  for  the  analysis  of  aberrations  in 
surveillance  data  at  the  state  level,  six  state  health  departments  evaluated  the 
usefulness of the "CPEG" (39). During the 4-month period of study, a total of 210
episodes  were  observed,  of  which  27  episodes  were  flagged  as  exceeding  historical 
limits;  one  state  had  no  episodes  of  unusual  reporting.  Overall,  14  episodes  (52%) 
represented  epidemiologically  confirmed  outbreaks.  Many  were  small,  and  none  were 
detected  when  aggregated  with  other  state  data  for  the  national  analyses.   Each 
disease  exceeded  historical  limits  at  least  twice  during  the  study  period,  and  for  all 
but  meningococcal  disease,  at  least  one  incident  represented  an  outbreak.  Although 
the  numbers  are  clearly  small,  the  proportion  of  episodes  that  represented  outbreaks 
varies.  This  is  expected  for  conditions  with  different  epidemiology. 

The  five  outbreaks  that  the  health  department  knew  about  but  that  were  not  detected  by 
the  CPEG  method  highlight  some  of  its  limitations.  In  three  outbreaks,  cases  were  not 
reported  nationally  as  current  reports;  thus,  they  were  not  included  with  the  data 
used  for  the  calculation.  The  other  two  outbreaks  were  not  detected  because  of 
concurrent  increases  in  the  corresponding  baseline. 

What  provision  is  there  for  updating  or  correcting  the  data  using  later 
reports?   In  the  NNDSS,  cases  are  reported  as  early  as  possible  and  then  later 
confirmed  or  modified.   The  methodology  of  CPEG   is  applied  to  the  provisional 
(earliest  reported)  data.   In  our  study  of  six  states,  two  of  the  five  outbreaks  that 
were  not  detected  reflected  late  reports  not  included  in  the  current  reporting  period. 



How  is  the  baseline  determined?  The  choice  of  5  years  as  a  baseline  period  was 
based  on  a  consideration  of  appropriate  sample  size  balanced  by  a  desire  to  use  the 
same  method  for  all  conditions.   Although  a  longer  baseline  might  be  used  for  some 
conditions  with  a  long  reporting  history,  epidemics  or  changes  in  trend  in  the 
baseline  will  increase  the  variance  of  the  baseline  and  thus  offset  any  benefit  of 
additional  data.  An  additional  source  of  variation  may  be  increases  in  reporting  due 
to  intensive  investigation.   In  these  cases,  the  analyst  may  choose  to  omit  or  adjust 
the  increased  baseline  data. 

How  are  outbreaks  in  the  baseline  handled?  CPEG   as  presented  here  does  not  adjust 
for epidemics in the baseline. The result is a progressive decline in
sensitivity as an outbreak moves into and then out of the baseline window. To
address this point, one could use a median of the baseline reports (rather than a
mean). Unfortunately, this replacement invalidates the technique used to compute the
point  for  signalling  aberrations,  and  the  alternative  methods  for  calculating  this  are 
not  as  accessible  to  the  practicing  epidemiologist  as  the  CPEG  methodology. 
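The effect of an epidemic in the baseline can be illustrated with a short sketch. It assumes that the current 4-week total is divided by the mean of 15 historical 4-week totals and is flagged when the ratio falls outside 1 plus or minus 2 standard deviations of those totals divided by their mean. These details are not restated in this section and should be read as assumptions; the function name and all counts are hypothetical.

```python
from statistics import mean, stdev


def historical_limits(current, baseline):
    """Compare a current 4-week total with historical 4-week totals.

    Returns (ratio, lower limit, upper limit, flagged).
    Assumption: limits are 1 +/- 2 * SD / mean of the baseline totals.
    """
    base_mean = mean(baseline)
    spread = 2 * stdev(baseline) / base_mean
    ratio = current / base_mean
    lower, upper = 1 - spread, 1 + spread
    return ratio, lower, upper, not (lower <= ratio <= upper)


# Hypothetical baseline: 15 four-week totals from the previous 5 years.
quiet = [100] * 7 + [90] * 4 + [110] * 4
# The same baseline with one period inflated by a past outbreak.
with_outbreak = [300] + quiet[1:]

for label, baseline in (("quiet baseline", quiet),
                        ("outbreak in baseline", with_outbreak)):
    ratio, lower, upper, flagged = historical_limits(130, baseline)
    print(f"{label}: ratio {ratio:.2f}, limits {lower:.2f} to {upper:.2f}, "
          f"flagged: {flagged}")
```

In this sketch the same current total is flagged against the quiet baseline but not against the baseline that contains an outbreak, because the outbreak raises both the mean and the variance of the historical totals.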

What are the sensitivity and predictive value positive of the method?
Applied by states, CPEG detected 14 of 19 outbreaks (74%), and 14 of the 27 episodes
(52%) that exceeded historical limits were actually outbreaks. The sensitivity (74%) and
predictive value positive (52%) of CPEG in states are therefore quite high. Partly
because  of  the  use  of  provisional  data,  we  use  the  mean  of  the  historical  baseline  in 
the  calculation.  We  investigated  the  predictive  value  positive  of  the  CPEG   from  six 
state  health  departments  by  asking  each  department  to  follow  up  on  aberrations 
detected  by  this  system.   In  addition,  we  asked  that  outbreaks  that  came  to  their 
attention  through  other  sources  but  had  not  been  identified  by  CPEG   be  noted. 
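The two measures can be reproduced directly from the counts reported for the six-state study. The sketch below only restates that arithmetic; the function names are illustrative.

```python
def sensitivity(detected, missed):
    """Proportion of all known outbreaks that the method flagged."""
    return detected / (detected + missed)


def predictive_value_positive(true_flags, all_flags):
    """Proportion of flagged episodes that were confirmed outbreaks."""
    return true_flags / all_flags


detected = 14   # outbreaks flagged by CPEG
missed = 5      # outbreaks known to health departments but not flagged
flagged = 27    # episodes exceeding historical limits

print(f"sensitivity: {sensitivity(detected, missed):.0%}")
print(f"predictive value positive: "
      f"{predictive_value_positive(detected, flagged):.0%}")
```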

What  are  the  mechanics  of  operation?  For  any  analytic  method  to  be  useful,  it 
must  be  easily  implemented  in  the  routine  work  of  the  practicing  epidemiologist.   In 
evaluating the use of CPEG at the national level, an epidemiologist routinely
evaluated each aberration, analyzed state distributions, and conveyed results to each
CDC program responsible for the control of the condition. Additional information was
provided by epidemiologists in state health departments. Investigation was based on
this evidence in addition to that obtained through other analysis. Eventually, state
health departments will have the software to generate CPEG locally.

Emergent  methods  provide  opportunities  for  the  future  of  surveillance  analysis.  Many 
methods  of  pattern  recognition  are  based  on  Bayesian  concepts,  in  which  a  different 
approach  is  taken  to  the  process  that  generates  the  data--in  this  context,  reports  of 
a  health  event. 

Classical  statistical  theory  regards  the  data  as  arising  from  a  process  with  unknown 
but  constant  parameters.   The  objective  of  classical  methods,  then,  is  to  use  the 
observed  data  to  estimate  or  make  inferences  about  the  unknown  values.  Bayesian 
methods  regard  the  parameters  as  having  prior  distributions,  independent  of  the  data, 
and  the  data  are  used  to  update  or  refine  our  idea  of  this  distribution.   "The  gain  in 
introducing  the  prior  [distribution]  is  partly  that  it  provides  a  way  of  injecting 
additional  information  into  the  analysis  and  partly  that  there  is  a  gain  in  logical 
clarity"  (40)  . 
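As a concrete illustration of updating a prior distribution with data, the sketch below uses a gamma prior for the mean weekly number of reports together with Poisson counts, a conjugate pair for which the update is simple arithmetic. The choice of model, the prior parameters, and the counts are assumptions made for illustration and are not drawn from the cited references.

```python
def update_gamma_prior(shape, rate, counts):
    """Update a Gamma(shape, rate) prior for a Poisson mean with observed counts.

    The posterior is Gamma(shape + sum(counts), rate + len(counts)).
    """
    return shape + sum(counts), rate + len(counts)


# Hypothetical prior belief: about 10 reports per week (mean = shape / rate).
prior_shape, prior_rate = 20.0, 2.0
weekly_reports = [14, 16, 15]  # hypothetical provisional counts

post_shape, post_rate = update_gamma_prior(prior_shape, prior_rate, weekly_reports)
print(f"prior mean: {prior_shape / prior_rate:.1f}")
print(f"posterior mean: {post_shape / post_rate:.1f}")
```

The posterior mean lies between the prior mean and the average of the observed counts, and it moves toward the data as more weeks are observed.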

In  the  application  to  data  generated  over  time  and  space  as  public  health  surveillance 
reports,  the  Bayesian  approach  recognizes  the  value  of  information  beyond  the  mere 
data history (e.g., a change in the definition of a reportable case of AIDS) (41). In
such circumstances, no statistical model can be expected to predict such occurrences
using historical data only. "There is a tendency to overfit [sic] a particular past
realization at the expense of the unrealized future" (42). It is necessary to have a
system  in  which  people  can  convey  their  information  to  the  method  and  have  the  method 
convey  this  uncertainty  in  a  way  that  is  useful  for  intervention  and  control. 

One  important  application  of  Bayesian  methodology  is  to  increase  the  stability  of 
observed  rates  of  health  events  on  the  basis  of  data  for  small  populations.   For 
example,  county-level  mapping  may  provide  the  resolution  necessary  to  identify  regions 
with  potentially  elevated  risk,  but  the  high  variability  of  observed  rates  in  counties 
with  small  populations  may  mask  any  underlying  patterns.  A  two-stage  empirical  Bayes 
procedure (43) addresses this problem by augmenting information for one county with
that of all other counties. Devine (44) applied this method to mapping of
injury-related mortality rates for the United States from 1979 through 1987. This work
represents an important step towards producing meaningful maps for small areas.
However, sensitivity to model assumptions and consideration of spatial dependence
remain areas for investigation.
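The idea of borrowing information from other counties can be sketched with a simple shrinkage estimator. The code below is not the two-stage procedure cited above; it is a simpler one-stage, method-of-moments empirical Bayes estimator that ignores spatial dependence, and the function name and county data are hypothetical.

```python
def eb_smooth(events, populations):
    """Shrink raw county rates toward the overall rate.

    Counties with small populations are pulled further toward the overall
    rate than counties with large populations.
    Returns (overall rate, raw rates, smoothed rates).
    """
    total_pop = sum(populations)
    overall = sum(events) / total_pop
    raw = [e / n for e, n in zip(events, populations)]
    # Population-weighted variance of the raw rates around the overall rate.
    weighted_var = sum(n * (r - overall) ** 2
                       for r, n in zip(raw, populations)) / total_pop
    mean_pop = total_pop / len(populations)
    # Estimated variance of the true rates; truncated at zero.
    prior_var = max(weighted_var - overall / mean_pop, 0.0)
    smoothed = []
    for r, n in zip(raw, populations):
        denom = prior_var + overall / n
        weight = prior_var / denom if denom > 0 else 0.0
        smoothed.append(overall + weight * (r - overall))
    return overall, raw, smoothed


# Hypothetical deaths and populations for five counties.
events = [2, 50, 90, 0, 120]
populations = [1000, 100000, 50000, 800, 200000]
overall, raw, smoothed = eb_smooth(events, populations)
for n, r, s in zip(populations, raw, smoothed):
    print(f"population {n:>7}: raw {r * 1e5:7.1f}, "
          f"smoothed {s * 1e5:7.1f} per 100,000")
```

In this sketch the rates for the two smallest counties move most of the way toward the overall rate, while the rate for the largest county is left nearly unchanged.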
Chapter VII


COMMUNICATING    INFORMATION   FOR   ACTION 


Richard  A.    Goodman 

Patrick  L.   Remington 

Robert J. Howard


"All I know is just what I read in the papers."


Will  Rogers 


DEFINITION  OF  THE  PROBLEM:  COMMUNICATING 
SURVEILLANCE  DATA 

Standard  definitions  for  public  health  surveillance  specify  the  requirement  for  the 
timely  dissemination  of  findings  to  those  who  have  contributed  and  others  who  need  to 
know (1-3). In the United States, surveillance findings have been disseminated
through  the  Morbidity  and  Mortality  Weekly  Report    (MMWR)    series  of  publications, 
public  health  bulletins  in  states,  and  special  reports  in  peer-reviewed  journals. 
However,  even  though  new  technologies  and  epidemiologic  methodologies  have 
dramatically  improved  the  collection  and  analysis  of  surveillance  data,  public  health 
programs  have  lagged  in  developing  effective  approaches  to  the  dissemination  of 
surveillance findings, and to the ultimate successful communication of those findings.

As  recently  as  the  1970s,  public  health  surveillance  in  the  United  States  focused 
almost  exclusively  on  the  detection  and  monitoring  of  cases  of  specific  communicable 
diseases,  and  surveillance  data  were  disseminated  primarily  in  a  basic  tabular  format. 
However,  surveillance  efforts  have  expanded  rapidly  and  now  include  chronic  diseases, 
injuries,  occupationally  acquired  conditions,  and  other  problems.   In  addition, 
surveillance encompasses problems as diverse as personal behavior (e.g., cigarette
smoking and seat-belt use); environmental insults (e.g., hazardous materials
incidents);  and  preventive  practices  (e.g.,  Pap  smears  and  mammographic  screening). 

Because  of  the  fundamental  changes  in  public  health  programs  and  priorities,  programs 
at  all  levels  require  innovative  approaches  to  convey  surveillance  findings  to  new  and 
more  diverse  constituencies.  This  chapter  provides  a  practical  framework  for 
optimizing  dissemination  and  communication  of  information  developed  through  public 
health  surveillance  efforts. 

BASIC  CONCEPTS  FOR  DISSEMINATING  AND  COMMUNICATING 
SURVEILLANCE  INFORMATION 

Surveillance  has  been  characterized  as  a  process  that  provides  "information  for 
action."  This  concept  is  inherently  consistent  with  one  definition  that  described 
communications  as  "...a  process,  which  is  a  series  of  actions  or  operations,  always  in 
motion,  directed  toward  a  particular  goal"  (5).   On  the  basis  of  this  definition, 
then,  public  health  programs  must  ensure  more  than  the  mere  transmission  or 
dissemination  of  surveillance  results  to  others;  rather,  surveillance  data  should  be 
presented  in  a  manner  that  facilitates  their  consequent  use  for  public  health  actions. 
One  fundamental  concept  is  that  the  terms  "dissemination"  and  "communication"  cannot 
be  used  interchangeably.  Dissemination  is  a  one-way  process  through  which  information 
is conveyed from one point to another. In comparison, communications is a loop
involving at least a sender and a recipient and is a collaborative process. The
communicator's  job  is  completed  when  the  targeted  recipient  of  the  information 
acknowledges  receipt  and  comprehension  of  that  information. 

A  basic  framework  for  disseminating  the  results  of  public  health  surveillance  with  the 
intent  of  communicating  can  be  adapted  from  fundamental  models  for  communications. 
One such model, which emphasizes the effect of communications, includes the sender, the
message, the receiver, the channel, and the impact (3). The sender is the person
responsible for surveillance of each health condition being monitored. For
applications in public health practice, this model can be modified (see Table VII.1).

Each of these steps is discussed in greater detail in the paragraphs below. They
should all be read with the understanding that one should never disseminate more
information than one can evaluate and revise, as needed, during the communications
process.

Establish  Message 

The  primary  message  or  communications  objective  for  the  findings  of  any  public  health 
surveillance  effort  should  reflect  the  basic  purposes  of  the  surveillance  system.   In 
this  textbook,  the  purposes  of  surveillance  systems  have  been  described  (Chapters  I 
and II). For each of these categories, the findings and interpretation of
surveillance  data  may  necessitate  a  different  type  of  public  health  response.   In 
addition  to  disseminating  data  to  those  who  may  have  contributed,  the  communications 
objectives  should  also  dictate  the  delivery  of  the  information  to  the  relevant  target 
groups  and  the  stimulation  of  appropriate  public  health  action,  as  illustrated  below. 

To  detect  and  control  outbreaks 

When  the  purpose  of  a  surveillance  system  is  to  detect  outbreaks  or  other  occurrences 
of  disease  in  excess  of  predicted  levels,  the  primary  communications  objective  should 
be  to  inform  two  groups:  a)  the  population  at  risk  of  exposure  or  disease,  and  b) 
persons  and  organizations  responsible  for  immediate  control  measures  and  other 
interventions.  For  example,  when  surveillance  efforts  detect  influenza  activity  in  a 
specific  locality,  public  health  agencies  can  promptly  disseminate  this  information  to 
health-care  providers  who  may,  in  turn,  intensify  efforts  to  vaccinate  or  provide 
amantadine  chemoprophylaxis  to  persons  at  high  risk  of  complications  from  influenza. 
The  release  and  timing  of  such  messages  should  be  carefully  considered  and  coordinated 
with  appropriate  agencies. 

In  the  context  of  this  example,  the  impact  of  releasing  a  message  recommending  the  use 
of  amantadine  or  influenza  vaccine  may  be  enhanced  if  the  release  has  been  coordinated 
with  public  health  units,  local  pharmaceutical  suppliers,  and  medical  organizations. 

To  determine  etiology  and  natural  history  of  disease 

Public  health  surveillance  for  newly  recognized  or  detected  problems  may  be  initiated 
to  assist  in  determining  the  epidemiology,  etiology,  and  natural  history  of  such 
conditions. In such circumstances, the communications objective may simply be to
provide information that is sufficient to initiate surveillance.

For  example,  when  eosinophilia-myalgia  syndrome  (EMS)  was  recognized  in  the  United 
States  in  October  1989,  a  case  definition  was  developed  and  disseminated  to  the  public 
health  community  to  enable  the  immediate  implementation  of  national  surveillance  for 
EMS (4). Surveillance efforts were critical in characterizing the epidemiology and
natural  history  of  EMS,  as  well  as  in  assisting  in  the  development  of  hypotheses 
regarding  its  cause. 

To evaluate control measures

For  many  public  health  conditions,  surveillance  is  the  principal  means  for  assessing 
the  impact  of  control  measures.   Epidemiologic  trends  and  patterns  that  are  based  on 
surveillance  findings  must  be  conveyed  to  persons  involved  in  control  efforts  in  order 
to  refine  control  activities  and  guide  the  allocation  of  resources  in  support  of  those 
activities. 

Following a period of relative quiescence, the incidence of measles in the United
States surged beginning in the mid-1980s. When surveillance indicated that vaccination
coverage had declined substantially in some groups (e.g., children residing in
inner-city locations), key findings were conveyed to and used by public health programs
and primary care providers in targeting measles vaccination efforts.

To  detect  changes  in  disease  agents 

In  addition  to  monitoring  trends  in  the  occurrence  of  public  health  problems, 
surveillance  systems  may  be  fundamental  to  the  process  of  detecting  changes  in  disease 
agents  and  the  impact  of  these  changes  on  public  health.  For  example,  in  the  late 
1980s  in  the  United  States,  surveillance  documented  an  increase  in  the  incidence  of 
tuberculosis, an increase substantially in excess of predicted levels. In addition to
this overall trend, transmission of multi-drug-resistant tuberculosis (MDR-TB) was
detected in health-care and prison settings (5). The public health implications of
these  findings  are  similar  to  the  basic  considerations  outlined  above  for  detecting 
and  controlling  outbreaks:  specifically,  there  is  need  for  timely  and  effective 
notification  of  populations  at  risk  and  of  organizations  responsible  for 
control/prevention  measures.  Therefore,  in  the  case  of  MDR-TB,  the  communications 
objectives would include immediate notification of the public health community about
the problem with the intent of facilitating implementation of proper diagnostic,
therapeutic, and preventive measures.

To  detect  changes  in  health  practices 

Some  surveillance  systems  monitor  changes  in  health  practices  and  behaviors  in  the 
population rather than changes in patterns of disease (6). This "life-style"
information  is  particularly  important  for  problems  such  as  chronic  disease,  for  which 
trends  in  risk  behavior  often  precede  changes  in  health  outcome  by  years  or  even 
decades.  The  communications  objective  in  this  context  is  often  to  increase  awareness 
regarding  the  role  of  behavior  in  causing  disease  or  injury.   In  addition,  this 
information  may  be  used  to  identify  high  risk  groups  in  the  population. 

For  example,  surveillance  data  regarding  trends  in  cigarette  smoking  indicate  that 
smoking  rates  have  not  declined  among  persons  with  lower  educational  attainment. 
Accordingly,  surveillance  data  which  characterize  risk  factors  (such  as  smoking), 
outcomes,  health  services,  and  other  related  factors  may  guide  public  health  programs 
and  decision  makers  in  the  implementation  of  targeted  communitywide  or  statewide 
intervention  strategies   (7). 

To facilitate planning of health policies

For  some  conditions,  the  most  appropriate  control  measure  is  promulgation  of  a  public 
health  policy.   In  this  context,  surveillance  information  about  the  public  health 
impact  of  different  conditions  and  problems  must  be  effectively  communicated  to 
legislators  and  public  health  policy  makers. 

For  example,  in  California,  surveillance  information  about  smoking-attributable 
mortality,  morbidity,  and  economic  costs  helped  in  enacting  Proposition  99.  This 
legislation  provided  for  a  25-cent  increase  in  the  state  cigarette  tax  which,  in  turn, 
funded  statewide  initiatives  to  prevent  and  control  the  use  of  tobacco.  Subsequently, 
surveillance  data  regarding  trends  in  the  prevalence  of  smoking  and  the  impact  of  this 
initiative  assisted  in  ensuring  the  application  of  state  funds  to  control  tobacco  use. 
Similarly,  data  for  the  United  States  have  confirmed  that  increases  in  cigarette  taxes 
have helped in reducing cigarette smoking (8).

Define the Audience

Identification  of  target  groups  is  an  essential  part  of  the  process  of  developing 
strategies  for  communicating  surveillance  results.   Typically,  public  health 
surveillance  information  and  reports  have  been  disseminated  in  a  standard  format  with 
only  limited  consideration  of  the  target  audiences  and,  more  importantly,  the 
techniques  to  communicate  effectively  to  these  groups.   In  general,  key  target  groups 
may  include  public  health  practitioners,  health  care  providers,  professional  and 
voluntary  organizations,  policy  makers  (e.g.,  from  the  executive  and  legislative 
branches  of  government),  the  press,  or  the  public. 

In  some  instances,  surveillance  information  should  be  disseminated  widely,  in  which 
case  communication  strategies  should  be  tailored  to  subgroups  of  greater  interest. 
For example, information regarding trends in injecting drug use (IDU)-related risks
for HIV is often communicated to the general public through the newspapers; however,
this strategy may be suboptimal for reaching the groups at highest risk, who use
alternative media such as radio and television (9).

Select  the  Channel 

Specification of the messages and audiences for surveillance results enables selection
of the most suitable channels of communication for this information. Traditionally,
surveillance information has been disseminated through published surveillance reports.
However, in addition to conventional means for communicating with traditional
audiences, the advent of new methods and technologies has made possible improved
communications with both old and new audiences. This spectrum of communications
options  includes  professional  and  trade  publications,  electronic  channels,  broadcast 
media,  print  media,  and  public  forums: 

•  Publications:  government  public  health  bulletins  and  surveillance  reports, 
peer-reviewed  public  health  and  biomedical  journals,  newsletters. 

•  Electronic: telecommunications systems (e.g., National Electronic
Telecommunications Surveillance System [see Chapter IV], Public Health
Net), fax and batch fax, audioconferences, videoconferences.

•  Media: news releases, news conferences, fact sheets, video releases.

•  Public  forums:  briefings,  hearings  and  testimony,  conferences  and  other 
planned  meetings. 

Market  the  Information 

Once  the  message  has  been  defined  and  the  target  audience  and  channel  selected,  it  is 
critical to assure that the information is communicated and marketed, not merely
disseminated, to those who need to know. In the decade of the 1990s, enormous
quantities  of  information  concerning  public  health  are  communicated  through 
professional  channels,  as  well  as  the  print  and  electronic  media.  Because  of  the 
volume  of  essential  information,  as  well  as  time  constraints,  surveillance  information 
must  be  carefully  tailored  for  presentation  to  each  targeted  audience,  including 
public  health  and  health  care  professionals,  policy  makers,  and  the  public. 

To  ensure  that  surveillance  information  is  readily  communicated  to  target  audiences, 
public  health  agencies  should  use  those  techniques  that  are  most  effective  for 
marketing information. First, as a general principle, graphic formats and other
visual  displays  are  likely  to  be  more  effective  in  conveying  information  than 
conventional  tabular  presentations.   Such  formats  include  maps,  bar  graphs, 
histograms,  diagrams,  or  other  ways  of  visually  depicting  data  which  may  not  be 
readily  comprehended  through  tabular  presentation.  For  example,  in  December  1989,  the 
Centers  for  Disease  Control  introduced  a  graphic  format  for  displaying  national 
notifiable  disease  surveillance  data  in  the  Morbidity  and  Mortality  Weekly  Report 
(10). This bar graph (Figure V.12), which replaced a standard table, was designed
both  to  facilitate  interpretation  of  routine  notifiable  disease  data  and  to  enable 
timely  public  health  responses  to  changes  in  disease  patterns. 

Second, the principal components of the message can be focused by selecting the most
important point, then stating that point as a simple declarative sentence. This
message, termed the "single overriding communication objective" (SOCO), should
consider three questions:

•  What  is  new? 

•  Who is affected?

•  What works best?

For example, chronic disease surveillance data indicate that compared with
younger  women,  older  women  are  less  likely  to  have  received  a  Pap  test  in  the  past, 
are  more  likely  to  have  cervical  cancer  diagnosed  at  a  late  stage,  and  have  higher 
mortality  rates  due  to  cervical  cancer.   Traditionally,  this  information  might  be 
disseminated  to  health  care  and  public  health  providers  through  vital  statistics 
reports  and  other  published  accounts  about  cervical  cancer.  However,  if  these 
findings  are  to  be  used  as  a  basis  for  action,  they  first  must  be  synthesized,  then 
effectively communicated. Thus, in addition to being presented in detailed
reports, these findings may also be expressed through a single message, the SOCO: "Older women
need  to  get  regular  Pap  tests." 

Third,  techniques  must  be  used  which  present  (or  "package")  the  surveillance 
information  in  a  manner  which  captures  an  audience's  interest  and  focuses  attention  on 
a specific issue. Examples of these techniques are the use of introductory terms such
as: "A new study . . ."; "Recent findings . . ."; and "Information recently released
. . . ." These terms are likely to appeal more to a target audience than a presentation
which begins with a conventional preface, such as "Based on recent surveillance
findings, . . . ."

Fourth, the method and forum of release of surveillance information may be critical,
particularly when a timely release is required, or when the target audiences include
the  media,  the  public,  or  policy  makers.  Under  such  circumstances,  news  conferences 
or  other  news  releases  may  be  considered,  and  should  be  held  when  they  are  likely  to 
be  attended.  Foremost,  the  presenter  should  involve  reporters  in  the  public  health 
surveillance  process  by  "walking  them  through  it",  and  should  recognize  opportunities 
to  articulate  the  SOCO  on  camera  or  in  print.   Important  adjuncts  for  presenting  the 
information  include  readily  available  handouts  and  effective,  but  simple,  visuals. 

Evaluate  the  Effect 

Because  public  health  surveillance  is,  by  definition,  oriented  toward  action, 
evaluation  efforts  should  address  two  considerations:  first,  whether  surveillance 
information has been communicated to those who need to know; and second, whether the
information has had a beneficial effect upon the public health problem or condition of
interest.

Assessment  of  whether  surveillance  information  has  been  communicated  to  those  who 
need  to  know  may  be  accomplished  through  a  process  evaluation,  such  as  by  monitoring 
the  distribution  of  the  information  or  a  user  survey.   In  particular,  the 
effectiveness  of  communication  through  newspapers  can  be  evaluated  by  using  clipping 
services  which  determine  the  number  of  published  reports,  the  geographic  distribution 
of  the  reports,  and  the  proportion  of  the  total  audience  to  which  the  reports  have 
been  circulated.   In  addition,  process  evaluation  efforts  should  include  a  review  of 
the  content  of  articles  to  assess  both  the  accuracy  and  appropriateness  of  the 
communicated  message. 

The second consideration, the impact of the communications effort on the public health
problem, requires an evaluation of outcomes (e.g., knowledge or practices) within
specific target audiences.

Under  ideal  circumstances,  this  type  of  evaluation  requires  surveys  of  the  target 
audiences  both  before  and  after  the  surveillance  information  has  been  communicated  to 
detect  changes  in  levels  of  outcomes.  The  potential  for  such  evaluation  is 
constrained,  however,  by  technical  and  methodologic  challenges,  as  well  as  substantial 
resource  requirements. 

SUMMARY 

Effective communication of public health surveillance results represents the critical
link in the translation of science into action. Recognition of the key
components in this process (including the medium, the message, the audience, the
response, and the evaluation of the process) is the first step in completing the
communications loop.
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Chapter  VIII 


Evaluating  Public  Health  Surveillance 


Douglas  N.  Klaucke 


"The  best  way  to  escape  from  a  problem  is  to  solve  it." 


Brandon  Francis 


OVERVIEW 

The  overall  purpose  of  evaluating  public  health  surveillance  is  to  promote  the  most 
effective  use  of  health  resources.  The  highest-priority  public  health  events  should 
be  under  surveillance,  and  surveillance  systems  should  meet  their  objectives  as 
efficiently  as  possible.  Meeting  each  of  these  objectives  involves  evaluating 
surveillance  from  two  different  perspectives;  in  turn,  each  perspective  has  a  slightly 
different  emphasis  in  the  application  of  the  elements  of  surveillance  evaluation. 

TYPES OF EVALUATION

The first level of evaluation answers the question, "Should this health event be under
surveillance?"  This  question  should  be  answered  from  a  perspective  external  to  the 
surveillance  system  itself.   It  is  the  first  question  that  should  be  asked  when 
deciding  whether  to  start  a  new  system  or  before  conducting  a  detailed  evaluation  of 
an  existing  one.  This  "external"  evaluation  is  primarily  an  assessment  of  the  public 
health  importance  of  a  health  event  and  how  its  importance  compares  with  that  of  other 
health  events.  Once  a  health  event  is  identified  as  being  of  high  priority,  it  is 
important  to  consider  both  the  feasibility  and  cost  of  conducting  surveillance  for 
that  event.   If  this  first-level  evaluation  leads  to  a  decision  to  discontinue  a 
surveillance  system,  a  detailed  evaluation  of  that  system  is  superfluous. 

The  second  level  evaluates  an  operating  surveillance  system  for  a  high-priority  health 
event  to  increase  the  system's  utility  and  efficiency.   This  type  of  evaluation  may 
also compare two or more systems involving the same health event. This type of
evaluation  will  determine  whether  the  system  is  meeting  its  objectives,  serving  a 
useful  public  health  function,  and  operating  as  efficiently  as  possible.   It  should 
include  at  least  the  following  steps: 

•  An  explicit  statement  of  the  purposes  and  objectives  of  the  system 

•  A  description  of  its  operation 

•  Documentation  of  how  the  surveillance  system  has  been  useful 

•  An  assessment  of  the  different  quantitative  and  qualitative  attributes, 
and 

•  Estimates  of  the  cost  of  the  system. 

The  goal  is  to  maximize  the  system's  usefulness  and  to  achieve  the  simplest,  least 
expensive  system  that  meets  its  objectives. 

ADAPTING  THE  EVALUATION 

Although  all  systems  should  be  assessed  for  their  purpose  and  usefulness,  specific 
attributes  described  below  that  are  critical  to  one  system  may  be  less  important  to 
another. Efforts to improve certain attributes--such as the ability of a system to
detect a health event--may detract from other attributes--such as simplicity or
timeliness. Thus, the success of an individual surveillance system depends on the
proper balance of characteristics, and the strength of an evaluation depends on the
ability of the evaluator to assess these characteristics with respect to the system's
objectives. Any approach to evaluation must therefore be flexible.

Determining  the  most  efficient  approach  to  surveillance  for  a  given  health  event  is  an 
art.  There  is  room  for  creativity  and  opportunity  to  combine  scientific  rigor  with 
practical  realities.  The  methods  discussed  in  this  chapter  should  be  used  as  a  guide 
to  the  types  of  questions  that  need  to  be  answered  about  the  system.   Each  evaluation 
should  be  individually  tailored.  Few  evaluations  address  fully  all  of  the  methods 
outlined  in  this  chapter,  and  many  profitably  focus  on  only  one  or  two  major 
attributes, such as sensitivity and timeliness (1-3). Some of these elements may also
be  useful  for  evaluating  other  health-information  systems  or  evaluating  the  value  of 
secondary  data  sources  for  surveillance. 

Each  of  the  listed  aspects  of  a  surveillance  evaluation  will  be  discussed  in  the 
sections  that  follow:  public  health  importance,  objectives  and  usefulness,  operation 
of  the  system  and  qualitative  attributes  (simplicity,  flexibility,  and  acceptability), 
quantitative  attributes  (sensitivity,  predictive  value  positive,  representativeness, 
and  timeliness),  and  cost.  This  chapter  continues  the  process  through  which  methods 
for  evaluating  public  health  surveillance  systems  evolve  (4,5). 

PUBLIC  HEALTH  IMPORTANCE 

The  public  health  importance  of  a  health  event  and  the  need  for  surveillance  of  that 
health  event  can  be  described  in  a  variety  of  ways.   Health  events  that  affect  many 
people  or  require  large  expenditures  of  resources  are  clearly  important  in  a  public 
health  context.   However,  health  events  that  affect  relatively  few  persons  may  also  be 
important,  especially  if  the  events  cluster  in  time  and  place--e.g.,  a  limited 
outbreak  of  a  severe  disease.   At  other  times,  public  concerns  may  focus  attention  on 
a  particular  health  event,  creating  or  heightening  the  sense  of  importance  associated 
with  it.   Health  problems  that  are  now  rare  because  of  successful  control  measures  may 
be perceived as "unimportant," but their level of importance should be assessed on the
basis  of  their  potential  to  reemerge.  Finally,  the  public  health  importance  of  a 
health  event  is  influenced  by  its  preventability  and  the  ability  of  public  health 
action to influence it.

Some measures of the importance of a health event, and, therefore, of the surveillance
system  that  monitors  it,  include  the  following: 

•  Magnitude  of  the  problem:   Total  number  of  cases,  incidence,  and 
prevalence. 

• Severity: Mortality rate and case-fatality ratio.

• Morbidity: Physician visits, hospital days.

• Premature mortality: Years of potential life lost (YPLL).

• Economic cost: Costs of medical care, lost productivity.

• Preventability: Prevented fraction.
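
As a rough illustration of how several of the measures listed above might be computed from basic counts, consider the following Python sketch. The function name, its parameters, the example figures, and the YPLL cutoff age of 65 years are all assumptions made for illustration; they are not drawn from any particular surveillance system.

```python
def importance_measures(cases, deaths, population, ages_at_death, ypll_cutoff=65):
    """Return simple measures of public health importance for one health event.

    All inputs are hypothetical counts for a single reporting period.
    """
    if cases <= 0 or population <= 0:
        raise ValueError("cases and population must be positive")

    # Magnitude: incidence per 100,000 population.
    incidence_per_100k = cases * 100_000 / population

    # Severity: mortality rate per 100,000 and case-fatality ratio.
    mortality_per_100k = deaths * 100_000 / population
    case_fatality_ratio = deaths / cases

    # Premature mortality: years lost before the cutoff age (assumed 65 here),
    # summed over deaths; deaths at or after the cutoff contribute nothing.
    ypll = sum(max(ypll_cutoff - age, 0) for age in ages_at_death)

    return {
        "incidence_per_100k": incidence_per_100k,
        "mortality_per_100k": mortality_per_100k,
        "case_fatality_ratio": case_fatality_ratio,
        "ypll": ypll,
    }


# Hypothetical example: 200 cases and 10 deaths in a population of 500,000.
example = importance_measures(
    cases=200,
    deaths=10,
    population=500_000,
    ages_at_death=[2, 30, 45, 60, 64, 70, 71, 80, 82, 90],
)
print(example)
```

Note how the five deaths before age 65 dominate the YPLL total, whereas the case-fatality ratio weighs all ten deaths equally; this is one reason different measures can rank the same health events differently.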

Measures  of  importance  used  should  take  into  account  the  effect  of  existing  control 
measures.  For  example,  the  number  of  cases  of  vaccine-preventable  illness  has 
declined  following  the  implementation  of  school  immunization  laws,  and  the  public 
health  importance  of  diseases  in  this  category  is  underestimated  by  case  counts 
alone.  In  such  instances,  it  may  be  possible  to  estimate  the  number  of  cases  that 
would be expected in the absence of control programs (6).
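
One way to express the effect of existing control measures is to compare the observed case count with the number expected in the absence of control. The Python sketch below assumes the prevented fraction is taken as (expected - observed) / expected; the function name and the figures are hypothetical, and the expected count would in practice come from a separate estimate such as pre-program incidence.

```python
def prevented_fraction(expected_cases, observed_cases):
    """Proportion of expected cases that did not occur (assumed definition)."""
    if expected_cases <= 0:
        raise ValueError("expected_cases must be positive")
    return (expected_cases - observed_cases) / expected_cases


# Hypothetical example: 10,000 cases expected without a program, 500 observed.
pf = prevented_fraction(expected_cases=10_000, observed_cases=500)
print(f"Prevented fraction: {pf:.2f}")
```

A case count of 500 taken alone would understate the importance of this condition; the prevented fraction makes the contribution of the control program explicit.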

Preventability  can  be  defined  at  several  levels--from  preventing  the  occurrence  of 
disease (primary prevention), through early detection and treatment (secondary
prevention), to minimizing the effects of the health problem among those already ill
(tertiary prevention). From the perspective of surveillance, preventability reflects
the  potential  for  effective  public  health  interventions  at  any  of  these  levels. 

The  need  for  surveillance  may  also  be  affected  by  factors  other  than  those  mentioned 
above. Political and public pressure may affect whether surveillance is undertaken--
or, at the other extreme, forbidden--for a specific health event. Regulations, laws,
and public health programs may be implemented on the basis of considerations other
than those listed above. However, it is still important to make the scientific
criteria as clear and explicit as possible.

Even  when  using  quantitative  measures,  judgment  is  necessary  to  decide  which  criteria 
are  most  relevant  for  each  condition.   It  is  important  to  make  these  judgments  as 
explicit--and as early--as possible.

Attempts have been made to quantify the public health importance of health
conditions. Dean described one such approach, which used a score that
accounted for age-specific mortality and morbidity rates and health-care costs (7).
The  Canadian  Laboratory  Centre  for  Disease  Control  has  used  explicit  criteria  in 
setting  national  surveillance  priorities  for  communicable  diseases.  Their  criteria 
include the parameters listed above, plus several others such as interest on the part
of the World Health Organization or the Department of Agriculture (Canada), potential
for outbreaks, public perception of risk, and necessity for immediate public health
response. Their ratings for 60 communicable diseases can be useful in setting
priorities for initiating a surveillance system (8).

SYSTEM  OBJECTIVES  AND  USEFULNESS 

The  most  important  steps  in  evaluating  a  surveillance  system  are  a)  describing  the 
health event(s) under surveillance, b) stating explicitly the objectives of the
system,  and  c)  describing  how  the  system  has  actually  been  used  to  help  prevent  and/or 
control  disease  or  injury.  These  three  steps  alone  often  sufficiently  indicate  how 
the  system  can  be  improved. 

Case definition(s) should be specified, including symptoms, signs, laboratory
results, and epidemiologic information; a scale of severity; and the different levels
of confidence in the diagnosis for each case, such as "suspected," "probable," and
"confirmed." Case definitions for nationally notifiable diseases have been published
for Canada and the United States (9,10). Table VIII.1 outlines a case definition
developed  by  the  Centers  for  Disease  Control  (CDC)  and  the  U.S.  Council  of  State  and 
Territorial  Epidemiologists. 

The  possible  objectives  of  surveillance  systems  and  the  uses  of  surveillance 
information  are  very  similar  and  have  been  reviewed  in  Chapter  I. 

A  surveillance  system  might  also  meet  a  statutory  requirement  based  on  political 
necessity  or  public  pressure  or  might  identify  cases  for  additional  studies.   There 
may  also  be  objectives,  such  as  meeting  the  reporting  requirements  of  the  World  Health 
Organization,  that  might  not  be  of  immediate  or  direct  benefit  to  the  agency  operating 
the surveillance system.

The usefulness of a system should be described specifically, including the actions
that  have  been  taken  as  a  result  of  the  data  and  analysis  from  the  surveillance 
system,  and  who  used  the  data  to  make  decisions  and  take  actions.   Other  anticipated 
uses  of  the  data  should  be  noted  and  their  feasibility  determined. 

A  surveillance  system  should  contribute  to  the  control  and  prevention  of  adverse 
health  events.   This  process  may  include  an  improved  understanding  of  the  public 
health  consequences  of  the  events.  A  surveillance  system  can  also  be  useful  if  it 
determines  that  an  adverse  health  event  previously  thought  to  have  public  health 
importance  actually  does  not. 

An  assessment  of  the  usefulness  of  a  surveillance  system  begins  with  a  review  of  the 
objectives  of  the  system  and  should  consider  the  dependence  of  policy  decisions  and 
control  measures  on  the  surveillance  system.   Depending  on  the  objectives  of  a 
particular  surveillance  system,  the  system  may  be  considered  useful  if  it 
satisfactorily  addresses  one  or  more  of  the  following  questions.  Does  the  system, 
e.g., 

•  detect  trends  signaling  changes  in  the  occurrence  of  the  health  problem  in 
question? 

•  detect  epidemics? 

•  provide  estimates  of  the  magnitude  of  morbidity  and  mortality  related  to 
the  health  problem  being  monitored? 

•  stimulate  epidemiologic  research  likely  to  lead  to  control  or  prevention? 

•  identify  risk  factors  involved  in  the  occurrence  of  the  health  problem? 

•  permit  assessment  of  the  effects  of  control  measures? 

•  lead  to  improved  clinical  practice  by  the  health-care  providers  who  are 
the  constituents  of  the  surveillance  system? 

Usefulness  may  be  affected  by  all  the  attributes  of  surveillance  described  below. 
Increased  sensitivity  may  afford  a  greater  opportunity  for  identifying  epidemics  and 
understanding  the  natural  course  of  an  adverse  health  event  in  a  community.   More 
rapid  reporting  allows  more  timely  control  and  prevention  activities.   Increased 
specificity enables public health officials to focus on productive activities. A
representative surveillance system will characterize more accurately the epidemiologic
features  of  a  health  event  in  the  population. 

OPERATION  OF  THE  SYSTEM 

To evaluate a surveillance system, one must know how it operates (see Chapter IV).
The system description should include the following:

• The people and organizations involved,

• The flow of information (up and down),

• Mechanisms of information transfer,

• Frequency of reporting and feedback, and

• Quality control.

The  evaluation  should  address  the  following  questions.   What  is  the  population  being 
monitored?  Who  is  responsible  for  reporting  a  case  (and  to  which  public  health 
agency)?  What  information  is  collected  on  each  case,  and  who  is  responsible  for 
collecting  it?   If  there  are  multiple  administrative  levels  represented  in  the  system, 
how  are  the  data  transferred  from  one  level  to  another?  How  is  information  stored? 
Who  analyzes  the  data?  How  are  they  analyzed,  and  how  often?  Are  there  preliminary 
and  final  tabulations,  analyses,  and  reports?  How  often  are  reports  disseminated?  To 
whom? By what mechanisms/media are the reports distributed? Are there any
"automatic" responses to case reports (e.g., follow-up of individual cases of rabies,
botulism, or poliomyelitis)?

A  diagram  is  often  useful  to  summarize  the  relationship  between  the  various  components 
of a system (Figure VIII.1).

ATTRIBUTES  OF  THE  SYSTEM 

Each  surveillance  system  has  characteristics  or  attributes  that  contribute  directly  to 
its  ability  to  meet  its  specific  objectives.  The  combination  of  these  attributes 
determines  the  strengths  and  weaknesses  of  the  system.  The  attributes  must  be 
balanced against each other (e.g., high sensitivity may only be possible with a
complex reporting system from a wide array of providers).

QUALITATIVE ATTRIBUTES

Simplicity  and  Flexibility 

In  describing  a  surveillance  system,  three  desirable  qualitative  attributes  should  be 
addressed:   simplicity,  flexibility,  and  acceptability. 

Simplicity  of  a  surveillance  system  refers  both  to  its  structure  and  to  its  ease  of 
operation.   Surveillance  systems  should  be  as  simple  as  possible,  while  still  meeting 
their  objectives.   It  may  be  useful  to  think  of  the  simplicity  of  a  surveillance 
system  from  two  perspectives:   the  design  of  the  system  and  the  size  of  the  system. 
The  following  measures  might  be  considered  in  evaluating  the  simplicity  of  a  system: 

• Amount and type of information necessary to establish a diagnosis,

• Number and type of reporting sources,

• Method(s) of transmitting case information/data,

• Staff training requirements,

• Type and extent of data analysis,

• Amount of computerization,

• Methods of distributing reports, and

• Amount of time spent operating the system.

The  cost  estimates  for  a  system  are  also  an  indirect  indicator  of  simplicity.  Simple 
systems usually cost less than complex ones. Another consideration is the ability of
the system to adapt to changing needs such as the addition of new conditions or
data-collection elements. This characteristic is termed "flexibility."

Acceptability 

Acceptability  reflects  the  willingness  of  individuals  and  organizations  to  participate 
in  the  surveillance  system.   This  attribute  refers  to  the  acceptability  of  the  system 
to health department staff and, at least equally important, to persons outside the
sponsoring agency (e.g., doctors or laboratory staff) who are asked to report cases
of certain kinds of health problems. To assess acceptability, one must consider the
points of interaction between the system and its participants, including subjects
(persons identified as having cases) and reporters. Indicators of acceptability
include the following: a) subject or agency participation rates; b) interview
completion rates and question refusal rates, if the system involves case interviews;
c) completeness of report forms; d) physician, laboratory, or hospital/facility
reporting rates; and e) timeliness of reporting.

QUANTITATIVE  ATTRIBUTES 

The four quantitative attributes of a surveillance system are sensitivity,
predictive  value  positive,  representativeness,  and  timeliness.  These  are  often 
difficult  to  measure  precisely,  but  even  indirect  estimates  can  be  useful  in  helping 
to  improve  the  efficiency  of  a  system  and  in  comparing  it  with  other  systems. 

Sensitivity 

The  sensitivity  of  a  surveillance  system  can  be  considered  on  two  levels.  First,  the 
completeness of case reporting--i.e., the proportion of cases of a disease or health
condition that are detected by the surveillance system (Table VIII.2)--can be
evaluated. Second, the system can be evaluated for its ability to detect epidemics
(11) (see Chapters V and VI).

The  sensitivity  of  a  surveillance  system  is  affected  by  the  likelihood  that 

•  persons  with  certain  health  conditions  seek  medical  care; 

• the condition is correctly diagnosed, which reflects the skill of care
providers  and  the  accuracy  of  diagnostic  tests;  and 

•  the  case  is  reported  to  the  system,  once  it  has  been  diagnosed. 

These  factors  also  apply  to  surveillance  systems  that  do  not  fit  the  traditional 
disease/care-provider  model.  For  example,  the  sensitivity  of  a  telephone-based 
surveillance  system  of  morbidity  or  risk  factors  would  be  affected  by 

•  the  number  of  people  who  have  telephones,  who  are  at  home  when  the 
surveyor  calls,  and  who  agree  to  participate; 

•  the  ability  of  persons  to  understand  and  correctly  answer  the  questions; 
and 

• the willingness of respondents to report their status.

The extent to which these questions are explored depends on the system and on the
resources  available  for  the  evaluation.   The  measurement  of  sensitivity  in  a 
surveillance  system  requires  the  validation  of  information  collected  through  the 
system,  so  as  to  distinguish  accurate  from  inaccurate  case  reports,  and  the  collection 
of  information  external  to  the  system,  so  as  to  determine  the  frequency  of  the 
condition in a community (i.e., a "gold standard") (22). From a practical
standpoint, the primary emphasis in assessing sensitivity--assuming that most reported
cases are correctly classified--is estimating what proportion of the total number of
cases in the community are being detected by the system. If this proportion is
estimated using methods that compare two or more surveillance systems, none of which
is a "gold standard," then this proportion should be called an estimate of
"completeness of coverage" rather than of sensitivity. (See also Chapter VI on
capture-recapture.)
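
To make the idea of "completeness of coverage" concrete, the following Python sketch applies a two-source capture-recapture calculation. It uses Chapman's form of the Lincoln-Petersen estimator, which is a standard choice but is an assumption here rather than a method prescribed by this chapter. The function names and counts are hypothetical, and the estimate is only meaningful if the two sources identify cases independently and records can be matched reliably between them.

```python
def chapman_estimate(n_source_a, n_source_b, n_both):
    """Estimate the total number of cases from two overlapping sources.

    n_source_a, n_source_b: cases found by each source
    n_both: cases found by both sources (matched records)
    """
    return (n_source_a + 1) * (n_source_b + 1) / (n_both + 1) - 1


def completeness_of_coverage(n_source_a, n_source_b, n_both):
    """Estimated proportion of all cases captured by each source and by both combined."""
    total = chapman_estimate(n_source_a, n_source_b, n_both)
    distinct_cases = n_source_a + n_source_b - n_both
    return {
        "estimated_total": total,
        "source_a": n_source_a / total,
        "source_b": n_source_b / total,
        "combined": distinct_cases / total,
    }


# Hypothetical example: physician reports find 120 cases, laboratory reports
# find 90, and 60 cases appear in both.
coverage = completeness_of_coverage(n_source_a=120, n_source_b=90, n_both=60)
for name, value in coverage.items():
    print(f"{name}: {value:.3f}")
```

Because neither source is a "gold standard," the resulting proportions are estimates of completeness of coverage, not of sensitivity.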

A  surveillance  system  that  does  not  have  high  sensitivity  can  still  be  useful  in 
monitoring  trends,  as  long  as  the  sensitivity  and  predictive  value  positive  remain 
reasonably  constant.   Questions  concerning  sensitivity  in  surveillance  systems  most 
commonly  arise  when  changes  in  patterns  of  occurrence  of  the  health  problem  are 
noted.   Changes  in  sensitivity  can  be  precipitated  by  heightened  awareness  of  a  health 
problem,  introduction  of  new  diagnostic  tests,  or  changes  in  the  method  of  conducting 
surveillance. A search for such surveillance "artifacts" is often an initial
step  in  investigating  an  outbreak. 

Several  evaluations  have  looked  at  the  sensitivity  or  completeness  of  coverage  of 
surveillance systems (13-15).

Predictive  value  positive 

Predictive  value  positive  (PVP)  is  defined  as  the  proportion  of  persons  identified  as 
case-patients who actually have the condition being monitored (11). In Table VIII.2
above this is represented by A/(A+B).
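
The two proportions can be illustrated with a short Python sketch. It assumes the conventional layout of the two-by-two table, in which A is the number of reported cases that are true cases, B the number of reported cases that are not, and C the number of true cases missed by the system. Only A/(A+B) is stated in the text; the sensitivity formula A/(A+C) rests on that assumed layout, and the counts are hypothetical.

```python
def sensitivity(a, c):
    """Proportion of true cases detected by the system: A / (A + C)."""
    return a / (a + c)


def predictive_value_positive(a, b):
    """Proportion of reported cases that are true cases: A / (A + B)."""
    return a / (a + b)


# Hypothetical validation study: 100 reports, of which 80 are confirmed (A)
# and 20 are not (B); a further 120 true cases were never reported (C).
a, b, c = 80, 20, 120
sens = sensitivity(a, c)
pvp = predictive_value_positive(a, b)
print(f"Sensitivity: {sens:.2f}, PVP: {pvp:.2f}")
```

In this example most reports are correct (high PVP) even though the system detects fewer than half of all cases (low sensitivity), which shows why the two attributes need to be assessed separately.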

In  assessing  PVP,  primary  emphasis  is  placed  on  the  confirmation  of  cases  reported 
through  the  surveillance  system.   Its  effect  on  the  use  of  public  health  resources  can 
be  considered  on  two  levels.  At  the  level  of  an  individual  case,  PVP  affects  the 
amount of resources required for investigation of cases. For example, where every
reported case of hepatitis A is promptly investigated by a public health nurse, and
family members at risk are referred for a prophylactic immune globulin injection, each
reported case generates a requirement for follow-up. A surveillance system with low
PVP  and  therefore  frequent  "false-positive"  case  reports  would  lead  to  resources  being 
wasted  on  cases  that  do  not,  in  fact,  exist. 

The  other  level  is  that  of  detection  of  epidemics.   A  high  rate  of  erroneous  case 
reports  over  the  short  term  might  trigger  an  inappropriate  outbreak  investigation,  and 
conversely,  a  constant  high  level  of  "false-positive"  reports  might  mask  a  true 
outbreak.   In  assessing  this  attribute,  we  want  to  know  what  proportion  of  epidemics 
identified  by  the  surveillance  system  are  "true  epidemics." 

Calculating  the  PVP  requires  confirmation  of  all  cases.   Interventions  initiated  on 
the  basis  of  information  obtained  from  the  surveillance  system  should  be  documented 
and  kept  on  file.   Personnel  activity  reports,  travel  records,  and  telephone  logbooks 
may  all  be  useful  in  estimating  the  impact  of  the  PVP  on  the  detection  of  epidemics. 

A  low  PVP  means  that  a)  non-cases  are  being  investigated,  and  b)  there  may  be  mistaken 
reports  of  epidemics.   "False-positive"  reports  to  surveillance  systems  lead  to 
unnecessary  interventions,  and  falsely  detected  "epidemics"  lead  to  costly 
investigations. A surveillance system with high PVP will lead to less
unnecessary and inappropriate expenditure of resources (16).

The  PVP  for  a  health  event  may  be  enhanced  by  clear  and  specific  case  definitions. 
Good  communication  between  the  persons  who  report  cases  and  staff  operating  the 
surveillance system can also improve PVP. The sensitivity and specificity of the case
definition, as well as the prevalence of the condition in the population, contribute to
the PVP (Table VIII.2); the PVP increases with increasing specificity and prevalence.
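
The dependence of PVP on specificity and prevalence can be sketched with the standard relationship PVP = (sensitivity x prevalence) / (sensitivity x prevalence + (1 - specificity) x (1 - prevalence)). The Python function below is an illustration only; its name and the input values are hypothetical, and it treats the case definition as a test applied uniformly to the whole population, which is a simplifying assumption.

```python
def pvp_from_case_definition(sens, spec, prevalence):
    """PVP implied by a case definition's sensitivity and specificity at a given prevalence."""
    true_positive_share = sens * prevalence
    false_positive_share = (1 - spec) * (1 - prevalence)
    denominator = true_positive_share + false_positive_share
    if denominator == 0:
        raise ValueError("no positive reports are possible with these inputs")
    return true_positive_share / denominator


# Hypothetical comparison: same sensitivity, varying specificity and prevalence.
baseline = pvp_from_case_definition(sens=0.90, spec=0.95, prevalence=0.01)
more_specific = pvp_from_case_definition(sens=0.90, spec=0.99, prevalence=0.01)
more_prevalent = pvp_from_case_definition(sens=0.90, spec=0.95, prevalence=0.10)

print(f"baseline:        {baseline:.3f}")
print(f"higher spec:     {more_specific:.3f}")
print(f"higher prevalence: {more_prevalent:.3f}")
```

With a rare condition, even a fairly specific case definition yields a low PVP; tightening the definition or restricting reporting to higher-prevalence settings raises it.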

Sensitivity  and  predictive  value  positive  are  inversely  related.   The  balance  between 
assuring  that  all  (or  almost  all)  cases  are  identified  (high  sensitivity)  and  few 
false  positives  are  identified  (high  PVP)  must  be  based  on  the  level  of  importance 
accorded  to  identifying  all  cases  (e.g.,  for  rabies  or  meningococcal  meningitis)  and 
the  ability  to  use  an  indicator  of  the  disease  in  the  community  (e.g.,  use  of 
Salmonella laboratory isolates).

Representativeness

A  truly  representative  surveillance  system  accurately  describes  the  occurrence  of  a 
health  event  over  time  and  its  distribution  in  the  population  by  place  and  person. 

Representativeness  is  assessed  by  comparing  the  characteristics  of  reported  events 
with  those  of  all  such  events  that  occurred.  Although  this  information  is  not 
generally  available  in  specific  detail,  some  judgment  of  the  representativeness  of 
surveillance  data  is  possible,  on  the  basis  of  knowledge  of  the  following  factors: 

• characteristics of the population--e.g., age, socioeconomic status, and
geographic location (17);

• natural history of the condition--e.g., latency period, fatal outcome;

• prevailing medical practices--e.g., sites performing diagnostic tests, and
physician-referral patterns (18,19);

•  multiple  sources  of  data--e.g.,  mortality  rates  for  comparison  with  data 
on  incidence,  laboratory  reports  for  comparison  with  physician  reports. 

Representativeness  can  also  be  examined  through  special  studies  of  a  representative 
sample of the population (16).

The  points  at  which  bias  can  enter  a  surveillance  system  and  decrease 
representativeness are illustrated in Figure VIII.2.

Case  ascertainment  bias  (Representativeness) 

This  might  also  be  called  "sampling  bias"  and  is  the  differential  identification 
and/or  reporting  of  cases  from  different  populations  or  over  time. 

In  order  to  generalize  findings  from  surveillance  data  to  the  population  at  large,  the 
data  from  a  surveillance  system  should  reflect  the  population  characteristics  that  are 
important  to  the  goals  and  objectives  of  that  system.   These  characteristics  generally 
relate  to  time,  place,  and  person.  An  important  result  of  evaluating  the 
representativeness  of  a  surveillance  system  is  the  identification  of  subgroups  in  the 
population that may be systematically excluded from the reporting system. This will
enable appropriate modification of data-collection practices and more accurate
projections  of  incidence  of  the  health  event  in  the  target  population. 

Changes  in  reporting  practices  over  time  can  introduce  bias  into  the  system  and  make 
it  difficult  to  follow  long-term  trends  or  establish  baseline  rates  to  be  used  for  the 
recognition  of  outbreaks.  For  example,  switching  from  a  passive  to  an  active  system 
or  changing  reporting  sources  may  change  the  sensitivity  of  the  system.   Publicity  can 
also increase rates of reporting in passive systems (20). While more complete
reporting  is  desirable  in  principle,  it  is  difficult  to  predict  how  a  change  in 
reporting  practices  or  in  publicity  associated  with  the  reportable  condition  will 
change  the  proportion  of  cases  reported. 

Differences  in  reporting  practices  by  geographic  location  can  bias  the 
representativeness  of  the  system.   For  example,  the  National  Notifiable  Diseases 
Surveillance  System  (NNDSS)  aggregates  data  collected  independently  by  the  50  states, 
Washington, D.C., and several territories. For some infectious diseases, some states
collect data only from laboratories, whereas other states also accept cases reported
by health practitioners (21). Also, despite efforts to achieve consistency, case
definitions are not standardized across state and territorial boundaries (10).

Differential  reporting  rates  of  cases  may  occur  in  association  with  different 
characteristics  of  the  person,  so  that  cases  among  certain  subpopulations  may  be  less 
likely  to  be  reported  than  those  among  other  groups.  For  example,  an  evaluation  of 
reporting  on  viral  hepatitis  in  a  county  in  Washington  State  suggested  that  cases  of 
hepatitis B were underreported among homosexual men and that cases of non-A, non-B
hepatitis were underreported among persons exposed to blood transfusions. The importance
of these risk factors as contributors to the occurrence of these diseases was
apparently underestimated, as indicated by the selective underreporting of certain
hepatitis cases (22).

Bias  in  descriptive  information  about  a  reported  case 

Given  that  a  case  of  a  reportable  health  condition  has  been  identified  and  reported, 
there  may  be  errors  in  the  collection  and  recording  of  descriptive  information  about 
the case, or "information bias."

Most surveillance systems collect more than simple case counts. Information commonly
collected  includes  the  demographic  characteristics  of  affected  persons,  details  about 
the  health  event,  and  the  presence  or  absence  of  defined  potential  risk  factors.   The 
quality, usefulness, and representativeness of this information depend on its
completeness  and  validity. 

Quality  of  data  is  influenced  by  the  clarity  of  the  information  forms,  the  training 
and  supervision  of  persons  who  complete  surveillance  forms,  and  the  care  exercised  in 
management  of  data.   A  review  of  these  facets  of  a  surveillance  system  provides  an 
indirect  measure  of  quality  of  data.  An  examination  of  the  percentage  of  "unknown"  or 
"blank"  responses  to  items  on  surveillance  forms  or  questionnaires  is 
straightforward.  Assessing  the  validity  of  responses  requires  special  studies,  such 
as chart reviews or re-interviews of respondents.
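
The examination of "unknown" or "blank" responses described above can be sketched in a few lines of Python. The field names, the set of values treated as missing, and the sample records are all hypothetical; an actual system would define these according to its own report form.

```python
# Values treated as missing (an assumption; adjust to the form's coding scheme).
MISSING_VALUES = {"", "unknown", "unk", "blank"}


def percent_missing(records, fields):
    """Percentage of records with an unknown or blank value, for each field."""
    if not records:
        raise ValueError("at least one record is required")
    result = {}
    for field in fields:
        missing = 0
        for record in records:
            value = record.get(field)
            # None and absent keys count as blank; a numeric 0 does not.
            text = "" if value is None else str(value).strip().lower()
            if text in MISSING_VALUES:
                missing += 1
        result[field] = 100 * missing / len(records)
    return result


# Hypothetical case reports.
reports = [
    {"age": 34, "race": "Unknown", "onset_date": "1991-03-02"},
    {"age": None, "race": "White", "onset_date": ""},
    {"age": 0, "race": "", "onset_date": "1991-03-05"},
    {"age": 61, "race": "Black", "onset_date": "1991-03-09"},
]
missing_by_field = percent_missing(reports, ["age", "race", "onset_date"])
print(missing_by_field)
```

A tabulation like this measures completeness only; it says nothing about whether the recorded values are valid, which still requires the special studies mentioned above.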

Errors  and  bias  can  make  their  way  into  a  surveillance  system  at  any  stage  in  the 
reporting  and  assessment  process.   Because  surveillance  data  are  used  to  identify 
high-risk  groups,  to  target  interventions,  and  to  evaluate  interventions,  it  is 
important  to  be  aware  of  the  strengths  and  limitations  of  the  information  in  the 
system. 

So  far,  the  discussion  of  attributes  has  been  aimed  at  the  information  collected  for 
cases,  but  many  surveillance  systems  also  involve  calculating  morbidity  and  mortality 
rates.  The  denominators  for  these  rate  calculations  are  often  obtained  from  a 
separate  data  system  maintained  by  another  agency,  such  as  the  Bureau  of  the  Census  or 
the  National  Center  for  Health  Statistics  of  CDC.  Although  these  data  are  regularly 
evaluated,  thought  should  be  given  to  the  comparability  of  categories  (e.g.,  race, 
age,  or  residence)  used  in  the  numerator  and  denominator  of  rate  calculations. 
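The concern about comparable categories can be made concrete with a small sketch. It assumes that case counts and population estimates are each available as a mapping from category label to number; the function name and the sample figures are hypothetical.

```python
from typing import Dict, Mapping


def rates_per_100k(cases: Mapping[str, int], population: Mapping[str, int]) -> Dict[str, float]:
    """Compute category-specific rates per 100,000 population.

    Refuses to calculate when the numerator and denominator do not use
    exactly the same category labels.
    """
    unmatched = set(cases) ^ set(population)
    if unmatched:
        raise ValueError(f"categories do not match: {sorted(unmatched)}")
    rates = {}
    for group, count in cases.items():
        if population[group] <= 0:
            raise ValueError(f"population for {group!r} must be positive")
        rates[group] = 100_000 * count / population[group]
    return rates


# Hypothetical figures for two age groups.
case_counts = {"0-17": 12, "18-64": 30}
populations = {"0-17": 40_000, "18-64": 120_000}

print(rates_per_100k(case_counts, populations))  # {'0-17': 30.0, '18-64': 25.0}
```

Matching labels is a necessary check but not a sufficient one: two agencies may use the same label (e.g., a race category) with different definitions.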

Several  studies  have  looked  at  quality-assurance  problems  associated  with  surveillance 
data. A sample of National Electronic Injury Surveillance System (NEISS) records was
compared with emergency-room records to assess the quality of data recorded in the
surveillance system (23). A study of the quality of national malaria surveillance reports
was carried out in the United Kingdom (24). The quality of Behavioral Risk Factor
Surveillance System (BRFSS) data, which are obtained through monthly telephone
surveys, for behavioral risks associated with cardiovascular problems has been
examined in California (25). CDC has also examined the completeness of race-ethnicity
reporting in the NNDSS (26).


Timeliness 

Timeliness  reflects  the  delay  between  any  two  (or  more)  steps  in  a  surveillance 
system.   The  timeliness  of  the  system  can  best  be  assessed  by  the  ability  of  the 
system  to  take  appropriate  action  based  on  the  urgency  of  the  problem  and  the  nature 
of the public health response. Four points of time in the surveillance process are
most often considered when measuring timeliness: a) time of onset of disease or
occurrence of an injury, b) time of diagnosis, c) time the report of the case is received by
the public health agency responsible for control activities, and d) time of implementation
of control activities. Usually one of the first two points of time (a or b) is used
as the starting point, and each of the other two points (c, d) is used as an end
point.

Timeliness  is  usually  measured  in  days  or  weeks,  but  in  hospital  settings  it  might  be 
measured  in  hours;  for  diseases  that  do  not  necessitate  an  immediate  response,  it 
might be measured in months or even years.
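The intervals between the time points listed above can be summarized as a median delay. The following sketch assumes each case record carries dates under the hypothetical keys "onset", "diagnosis", and "received"; the dates are invented for illustration.

```python
from datetime import date
from statistics import median
from typing import Dict, List, Optional


def median_delay_days(cases: List[Dict[str, Optional[date]]], start: str, end: str) -> float:
    """Median number of days between two time points, skipping cases missing either date."""
    delays = [
        (case[end] - case[start]).days
        for case in cases
        if case.get(start) is not None and case.get(end) is not None
    ]
    if not delays:
        raise ValueError("no cases have both dates recorded")
    return median(delays)


# Hypothetical case records.
cases = [
    {"onset": date(1990, 3, 1), "diagnosis": date(1990, 3, 4), "received": date(1990, 3, 12)},
    {"onset": date(1990, 3, 5), "diagnosis": date(1990, 3, 6), "received": date(1990, 3, 15)},
    {"onset": date(1990, 3, 10), "diagnosis": date(1990, 3, 14), "received": date(1990, 3, 24)},
]

print(median_delay_days(cases, "onset", "received"))      # 11
print(median_delay_days(cases, "diagnosis", "received"))  # 9
```

The median is used here rather than the mean because a few very late reports can otherwise dominate the summary; that choice is an assumption of this sketch.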

Evaluations  of  the  timeliness  with  which  shigellosis  is  reported  in  two  different 
surveillance  systems  in  the  United  States  found  median  delays  of  11  and  12.5  days  from 
time  of  onset  of  illness  to  receipt  of  report  by  the  public  health  agency  responsible 
for  control  measures.   This  delay  did  not  allow  public  health  officials  to  intervene 
in  a  timely  manner  to  prevent  the  occurrence  of  secondary  or  tertiary  cases.   However, 
such  a  time  frame  might  still  allow  for  effective  intervention  in  settings,  such  as 
day-care facilities, in which outbreaks may persist for weeks or months (27). Another
study of timeliness in the reporting of salmonellosis, shigellosis, hepatitis A, and
bacterial meningitis looked at the reporting delay between date of onset and date of
report to the CDC (3). Median reporting delays ranged from 20 days for bacterial
meningitis to 33 days for hepatitis A. Wide variations in reporting delays were found
between states as well. A study in Australia showed that reports of infectious
diseases from laboratories were received by the Medical Officer of Health in a
substantially shorter time than those received from medical practitioners (13).

In contrast, if there is a long latency between exposure and appearance of disease,
the rapid identification of cases of illness may not be as important as the rapid
availability of data to interrupt and prevent exposures that lead to disease.

The need for rapid reporting to a surveillance system depends on the nature of the
public  health  problem  under  surveillance  and  the  objectives  of  the  system.  Recently, 
computer  technology  has  been  integrated  into  surveillance  systems  and  may  promote 
timeliness of reporting (28,29).

COST 

The  final  descriptive  element  is  an  estimation  of  the  resources  used  to  operate  the 
system.   The  estimates  generally  are  limited  to  direct  costs  and  include  the  costs  of 
personnel  and  resources  required  for  collecting,  processing,  and  analyzing 
surveillance  data,  as  well  as  for  the  dissemination  of  information  resulting  from  the 
system. 

Personnel  costs  may  be  determined  from  an  estimate  of  the  time  it  takes  to  operate  the 
system  for  different  personnel.   While  this  can  be  expressed  as  person-time  expended 
per  year  of  operation,  it  is  preferable  to  convert  the  estimate  to  dollar  costs  by 
multiplying  the  person-time  by  appropriate  salary  and  benefit  figures. 
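The conversion from person-time to dollar costs can be sketched as follows. The roles, hours, hourly salaries, and benefit rates are hypothetical values chosen for illustration, and treating benefits as a fraction of salary is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class StaffTime:
    role: str
    hours_per_year: float   # person-time spent operating the system
    hourly_salary: float    # dollars per hour
    benefit_rate: float     # fringe benefits as a fraction of salary (assumed)


def personnel_cost(staff: Iterable[StaffTime]) -> float:
    """Annual personnel cost: person-time multiplied by salary plus benefits."""
    return sum(
        member.hours_per_year * member.hourly_salary * (1 + member.benefit_rate)
        for member in staff
    )


# Hypothetical staffing for one year of operation.
staff = [
    StaffTime("epidemiologist", hours_per_year=200, hourly_salary=30.0, benefit_rate=0.25),
    StaffTime("data clerk", hours_per_year=500, hourly_salary=12.0, benefit_rate=0.25),
]

print(personnel_cost(staff))  # 15000.0
```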

Other  costs  may  include  those  associated  with  travel,  training,  supplies,  equipment, 
and  services  such  as  mail,  telephone,  rent,  and  computer  time. 

The resources required at all relevant levels of the public health system--from the
local health-care provider to municipal, county, state, and federal health agencies--
should be included.

The  approach  to  resources  described  here  includes  only  those  personnel  and  material 
resources  required  for  the  direct  operation  of  surveillance.   A  more  comprehensive 
evaluation  of  costs  should  examine  consequential  or  indirect  costs,  such  as  follow-up 
laboratory  testing  or  treatment,  case  investigations  or  outbreak  control  resulting 
from  surveillance,  costs  of  secondary  data  sources  (e.g.,  vital  statistics  or  survey 
data), and costs averted (benefits) by surveillance.

Costs are judged relative to benefits, but few evaluations of surveillance systems
have  included  a  formal  cost-benefit  analysis,  and  such  analyses  are  beyond  the  scope 
of  this  chapter.  Estimating  benefits,  such  as  savings  resulting  from  morbidity 
prevented  through  surveillance,  may  be  possible  in  some  instances,  although  this 
approach  does  not  take  into  account  the  less  tangible  benefits  that  may  result  from 
surveillance  systems.   More  realistically  and  in  most  instances,  costs  should  be 
judged  with  respect  to  the  objectives  and  usefulness  of  a  surveillance  system. 

Alternative  data  collections  may  be  compared  based  on  their  costs  and  number  of  cases 
identified (see also Chapter XII). For example, in Vermont, two methods of collecting
surveillance data were compared. The "passive" system was already in place and
comprised unsolicited reports of notifiable diseases to the district offices or the
state health department. The "active" system was implemented in a
probability sample of physicians' practices. Each week a health department employee
called these practices to solicit reports of selected notifiable diseases. In
comparing the two systems, an attempt was made to estimate associated costs. The
resource estimates that apply directly to the surveillance systems are shown in Table
VIII.3. The active system identified an additional 23 cases at an average cost of
$861 per case.
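One common way to compare two data-collection methods is the extra cost per extra case identified. The sketch below assumes that definition (the text does not spell out how the Vermont figure was calculated) and uses invented totals rather than the values in the table.

```python
def cost_per_additional_case(
    cost_new: float, cost_existing: float, cases_new: int, cases_existing: int
) -> float:
    """Extra dollars spent by the new method per extra case it identifies."""
    additional_cases = cases_new - cases_existing
    if additional_cases <= 0:
        raise ValueError("the new method identified no additional cases")
    return (cost_new - cost_existing) / additional_cases


# Hypothetical totals for an active and a passive system over the same period.
print(cost_per_additional_case(
    cost_new=24_000, cost_existing=4_000, cases_new=60, cases_existing=20
))  # 500.0
```

A figure of this kind ignores whether the additional cases led to any public health action, so it is best read alongside the objectives and usefulness of the system.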

RECOMMENDATIONS 

On  the  basis  of  the  evaluation,  an  assessment  of  how  well  the  surveillance  system  is 
meeting its current objectives should be made (Table VIII.4). Modifications to the
system  to  enhance  its  usefulness  and  improve  its  attributes  should  be  considered.  A 
regular  review  of  each  surveillance  system  should  assure  that  systems  remain 
responsive to contemporary public health needs.


Chapter IX


Ethical Issues


Robert  A.  Hahn 


"Epidemiologists  [and  surveillance  investigators]  should  be  cognizant  that  many  competing 
values  may  have  moral  weight  equal  to  or  greater  than  the  freedom  of  scientific 
inquiry. ... there are many clearly appropriate social restraints on epidemiologic research
[and  surveillance]." 

Beauchamp 


INTRODUCTION 

Webster  defines  ethics  as  "the  discipline  dealing  with  what  is  good  and  bad  or  right  and 
wrong  or  with  moral  duty  and  obligation."  A  professional  code  of  ethics  provides  a  guide 
to  right  and  wrong  behavior.  An  ethical  code  is  not  a  description  of  what  practitioners 
(and  others)  actually  do,  but  rather  a  prescription  for  what  they  should  do.  Ethical 
obligations derive principally from moral values--such as the "Golden Rule," presumably
shared by the broader society--rather than from scientific principles, such as "formulate
a hypothesis and a method before collecting data." However, ethical decisions require
an  understanding  of  the  objectives,  current  issues,  and  methods  of  the  scientific 
disciplines  to  which  they  refer. 

OVERVIEW 

Over the past several decades, much ethical discussion in health--i.e., "bioethics"--has
focused  on  clinical  medicine  and  medical  research,  and  thus  on  physicians  and  their 
patients  and  on  researchers  and  research  subjects.  Because  public  health  is  concerned 
with  the  public,  specific  principles  of  bioethics  may  not  apply  directly  to  public  health, 
although  underlying  moral  values  may  be  shared.  Ethical  principles  associated  with 
surveillance are perhaps closer to those of the social sciences than to those of clinical
medicine or medical research (1).

Indeed, public health ethics may conflict with the ethics of clinical medicine insofar
as clinical ethics--represented by such issues as patient confidentiality--compromise
public health (e.g., when the patient's condition threatens the health of others); or when
the demands of public health compromise the rights of individuals (e.g., in quarantine);
or when mass vaccination is required for public health despite the personal objections
of individual patients (2). The practice of public health generally assumes that
individual rights may be ethically superseded in the pursuit of public well-being and a
greater public good (2). Epidemiologists and ethicists have recently collaborated in the
formulation of ethical principles for epidemiology (3).

Although  characteristics  may  distinguish  surveillance-related  ethical  issues  from  ethical 
issues  in  other  areas  of  epidemiology  and  public  health,  many  of  the  ethical  issues 
confronting  public  health  surveillance  are  similar  to  those  of  epidemiology. 
Consequently,  much  of  the  discussion  in  this  chapter  draws  heavily  on  experience  in 
epidemiologic  research,  where  these  issues  have  been  more  fully  discussed.  Public  health 
surveillance  may  affect  the  public  in  several  ways.  Surveillance  is  the  principal  means 
by  which  the  health  status  of  the  population  is  assessed;  it  can  be  used  to  identify 
problems,  indicate  solutions,  plan  interventions,  and  monitor  change.  As  such,  public 
health  surveillance  commonly  requires  widespread  and  repeated  contact  with  the  public  it 
serves  regarding  basic  and  often  personal  matters  of  health  and  exposures  to  risk  factors. 
In  addition,  surveillance  systems  may  be  linked  with  other  systems,  requiring  compatible 
identifiers  of  individual  records;  and  systems  may  be  shared  among  researchers  or  public 
health  officials,  thus  increasing  chances  of  public  disclosure.  Many  facets  of 
surveillance  may  infringe  on  individual  privacy  and  therefore  may  increase  the  risk  of 
breaches  of  confidentiality. 

Several  theories  have  been  proposed  to  account  for  the  basic  principles  underlying  sound 
ethical  decisions.  Such  theories  are  relevant  in  public  health  decisions  about  resource 
allocation,  intervention,  surveillance,  and  other  issues,  but  are  only  briefly  mentioned 
here. 

Some  ethicists  dispute  the  possibility  of  formulating  general  ethical  principles,  because 
they believe that correct ethics are specific to each situation (i.e., "situation ethics")
(4). In contrast, most ethicists assume that ethical principles apply to different
situations; these ethicists commonly adopt one of two positions about the nature of
ethical rules. Utilitarians believe that ethical actions are those that most effectively
distribute valued goods within the population; this position is sometimes equated with
the epithet, "the end justifies the means." In contrast, deontologists believe that
certain  principles,  such  as  honesty,  are  fundamental,  and  that  ends,  such  as  the 
distribution  of  goods  in  a  population,  do  not  justify  the  violation  of  fundamental 
principles.  Public  health  intervention  programs  commonly  combine  utilitarian  and 
deontological  approaches.  They  attempt  to  maximize  the  distribution  of  health  benefits, 
while  maintaining  a  satisfactory  level  of  morality  in  the  means  of  distribution. 

MORAL  PRINCIPLES  IN  CLINICAL  MEDICINE  AND  RESEARCH 

Ethicists  have  formulated  several  basic  moral  principles  that  they  believe  underlie 
clinical medicine and research (5). Some of these basic principles apply to public health
surveillance: 

Respect  for  autonomy  asserts  that  "autonomous  actions  and  choices  should  not  be 
constrained by others" (5). Basic to the notion of autonomy is self-determination
and  voluntary  action. 

Beneficence  is  the  principle  that  one  should  act  to  enhance  the  welfare  of  others. 
Although  non-maleficence,  or  avoiding  acts  that  might  harm  others,  is  sometimes 
viewed  as  a  principle  separate  from  beneficence,  it  may  also  be  regarded  as  the 
first  tenet  of  beneficence.  That  is,  in  order  to  benefit  others,  one  must  at  least 
avoid  doing  them  harm. 

Paternalism is the active pursuit of another person's well-being (as perceived by
the pursuer), independent of--and sometimes contrary to--that person's express
wishes. Paternalism may be regarded as a form of beneficence. While paternalism
is generally thought of as protection of a person against harm to himself/herself,
the  notion  may  be  broadened  to  include  threatened  harm  to  others.  Paternalism 
commonly  conflicts  with  respect  for  autonomy  and,  perhaps  for  this  reason,  is  not 
a  popular  concept  in  the  United  States.  It  becomes  useful  when  a  person's  capacity 
for  autonomy  is  compromised  (as  may  occur  in  sickness)  or  when  personal  autonomy 
may  seriously  compromise  the  well-being  of  others. 

Justice  is  the  principle  promoting  the  equitable  distribution  of  burdens  and 
benefits in society. Unfortunately, there is no agreed-upon definition of equity;
the range includes an equal share for each person, each according to need, each
according to effort, each according to societal contribution, or each according to
presumed merit (5).

Other ethical principles are regarded by some ethicists as independent and by others as
derivative from more basic principles (5):

Veracity  is  the  duty  of  full  disclosure  of  relevant  information.  Veracity  is  often 
considered  a  duty  of  clinicians  or  researchers  but  may  also  be  a  duty  of  patients 
or  subjects. 

Privacy  is  the  duty  to  respect  a  person's  right  "...of  determining,  ordinarily,  to 
what  extent  his  thoughts,  sentiments,  and  emotions  shall  be  communicated  to  others" 
(6). Privacy includes protection from unwanted intrusions, and from the divulgence
of  personal  information  to  others.  The  right  to  privacy  may  derive  from  respect 
for  autonomy. 

Confidentiality  is  the  duty  not  to  disclose  information  about  individuals  without 
their consent. Confidentiality may be seen as a principle following from privacy.

Fidelity, commonly applied to the relationship between physician and patient, is
the duty to keep promises and maintain contracts.

CONFLICTS  AND  SANCTIONS 

While conflicts among ethical principles are common--e.g., paternalism versus respect for
autonomy--there is no simple prescription for resolving such conflicts. Utilitarians
might choose one alternative and deontologists, another. Attempts to prescribe principles
of conflict resolution emphasize that decisions should be accompanied by justification
of the choice (7).

In  contrast  to  medical  institutions,  institutions  of  public  health  and  epidemiology  do 
not  license  practitioners  and  do  not  maintain  official  sanctions  against  violations  of 
professional ethical standards (even insofar as such standards exist and are codified).
Public  health  practitioners  are  not  sued  for  malpractice.  Informal  sanctions  (e.g.,  the 
avoidance  of  unscrupulous  colleagues  or  loss  of  one's  job)  occur,  but  have  not  been 
systematically  described.  Some  epidemiologists  have  recently  proposed  an  ethical  duty 
to  monitor  and  address  the  unethical  practices  of  their  colleagues  (7).  In  contrast  to 
the absence of collegial sanctions in public health, some aspects of epidemiology and
surveillance are governed by law (e.g., violations of confidentiality by surveillance
personnel) (8).

Varying  degrees  of  contact  are  involved  in  different  forms  of  surveillance.  Environmental 
surveillance (e.g., of environmental lead or rates of Lyme disease infection of ticks)
may involve contact with animals or the physical environment rather than with humans;
surveillance using hospital records or death certificates involves indirect human contact;
surveillance by household interviews and/or physical examinations requires face-to-face
and/or physical contact. Ethical principles may vary from situation to situation and are
likely  to  be  more  stringent  as  more  human  contact  is  involved. 

This  chapter  focuses  on  surveillance  involving  face-to-face  human  contact.  Also 
considered  are  surveys  such  as  the  Health  Interview  Survey,  the  National  Health  and 
Nutrition  Examination  Survey,  and  the  Vital  Statistics  System  of  the  Centers  for  Disease 
Control's  National  Center  for  Health  Statistics.  These  surveys  or  statistical  systems 
may not meet the stringent objectives of public health surveillance, but because they
entail collecting personal information on individuals and are widely used for
surveillance, they provide examples of the issues surrounding data collection. The U.S. Census is also
considered,  because  census  information  plays  an  essential  role  in  providing  denominators 
for  surveillance  data. 

The  collection  of  public  health  information  may  involve  the  participation  of  many 
individuals  and  institutions.  Potential  participants  include  not  only  the  investigator 
and  subjects  of  surveillance  but  persons  in  the  immediate  social  environment  of  study 
subjects,  the  investigator's  colleagues,  the  broader  public  health  community,  clinicians, 
and  society  at  large.  Explicit  and  implicit  relations  among  these  parties  delineate  their 
ethical obligations to one another (Table IX.1). Ethical issues are reviewed below by
focusing  on  several  of  these  relationships. 

RELATIONSHIPS  IN  SURVEILLANCE  AND  THEIR  ASSOCIATED 
ETHICAL  OBLIGATIONS 


Surveillance practitioners and society at large. The practice of public health may
be regarded as one means by which a society addresses issues of well-being in the
population. Public health practitioners retain an essential connection with society at
large; ultimately, they are supported by and act at the behest of their public
constituency. The assumption is that, as they pursue and achieve public interests, they
should  be  supported  by  society  in  their  work. 

As  agents  of  public  welfare,  public  health  practitioners  have  several  ethical 
responsibilities  as  outlined  below: 

Choice  of  surveillance  topics.  In  pursuit  of  beneficence,  as  well  as  in  upholding 
public  fidelity,  practitioners  should  conduct  surveillance  on  priority  issues  with 
potential  public  health  benefit  (7).  "As  a  parallel  in  a  research  study,  it  would  be 
unethical  to  ask  anyone  to  participate  that  has  little  likelihood  of  producing  meaningful 
results  or  furthering  scientific  knowledge  for  the  good  of  society"  (5).  Insofar  as 
surveillance findings are basic indicators of health inequities and trends (e.g., in risk
or exposure, health-care access, morbidity, or mortality), the pursuit of justice is also
a  primary  moral  rationale  for  surveillance. 

Judgments  of  priority  and  potential  benefit  should  be  based  on  explicit  criteria,  such 
as  the  criteria  for  the  strength  of  scientific  evidence  used  by  the  Preventive  Services 
Task Force (10). Perhaps paradoxically, surveillance results themselves facilitate the
determination of priority issues (e.g., the magnitude and location of health problems
in the population).

Avoidance of conflicts of interest. As with other epidemiologic activities,
surveillance may be prone to conflict of interest. "Virtually all epidemiologic research
is sponsored, and few if any research sponsors, public or private, are disinterested in
the outcome of their epidemiologic research" (12). In their commitment to public
well-being, practitioners of surveillance must assure that data are collected to answer
scientific  or  public  health  questions  effectively,  rather  than  to  serve  the  interests  of 
financial  and  institutional  sponsors  or  to  "prove"  personal  preconceptions.  For  example, 
practitioners  must  assure  that  populations  surveyed  and  questions  asked  are  appropriate 
to  assess  the  issues  considered  and  not  to  find  "results"  desired  by  a  sponsor. 
Epidemiologists  have  presented  guidelines  for  avoiding  conflicts  of  interest  (22);  the 
guidelines apply to surveillance activities as well.

•  The investigator's independence from the sponsor must be
maintained in the design, conduct, and reporting of
epidemiologic (and surveillance) results. Written agreement
between investigator and sponsor may increase the likelihood
of independence.

•  Investigations  should  not  be  conducted  in  secrecy,  and  results 
should  be  published  in  a  timely  fashion. 

•  Decisions  on  release  and  publication  of  results  should  not  be 
influenced  by  the  interests  of  sponsors. 

•  All  sponsorship  should  be  acknowledged. 

•  Decisions  regarding  the  dissemination  and  publication  of  results 
should  be  made  by  the  investigator  rather  than  the  sponsor. 

Bond (23) has suggested that certain private industries may have an ethical obligation
to monitor the effects of their activities, for instance, the exposures and health of their
employees. Rothman (21) has argued that it is unethical to judge the results of
investigations  simply  on  the  basis  of  sponsorship,  e.g.,  private  industry.  Rather, 
investigations  should  be  judged  by  the  quality  of  the  work  involved. 

Methodologic  and  analytic  scrutiny.  The  principle  of  beneficence  requires  that  one 
choose  the  best  feasible  method  of  investigation  and  that  one  appropriately  analyze 
results--thus requiring knowledge of scientific methods (7).

Interpretation  and  recommendation.  The  principle  of  beneficence  also  requires  (as 
does  the  concept  of  surveillance  itself)  that  surveillance  data  be  interpreted  and  used 
to  assess  and  address  public  health  problems. 

Report  of  findings.  Finally,  the  principle  of  beneficence  requires  that  surveillance 
results  be  reported  understandably,  sensitively,  and  responsibly,  in  a  timely  fashion, 
with  scientific  objectivity  and  caution,  appropriate  confidence,  and  appropriate  doubt. 
"Epidemiologists  should  carefully  avoid  being  placed  in  a  situation  in  which  their  results 
might  be  suppressed  or  inappropriately  edited  by  either  internal  or  external  influences" 
(7). Some (14) have argued that epidemiologists should be advocates for the positions
firmly supported by their data. Others (25) have asserted that epidemiologists are
legitimate  expert  witnesses.  Practitioners  of  surveillance  must  also  be  free  of  internal 
or  external  constraints  and  must  be  able  to  present  the  results  of  their  work  objectively. 

INVESTIGATORS  AND  SUBJECTS 

Beneficence 

Surveillance  subjects  do  not  usually  benefit  directly  from  surveillance,  though  some 
benefit  to  them  may  accrue  as  a  side-effect  (e.g.,  when  surveillance  subjects  are  given 
physical  examinations  or  when  a  discovery  made  by  surveillance  serves  a  health  need  of 
a surveillance subject). When an adverse health condition is determined in the course
of  surveillance,  it  is  the  responsibility  of  the  investigator  to  provide  the  surveillance 
subject  with  timely  information  about  the  discovered  condition;  if  the  condition  is 
complex  or  sensitive,  such  information  may  be  best  conveyed  by  the  subject's  physician, 
trained counselors, or local public health officials (9).

Non-Maleficence

A  more  common  ethical  issue  in  surveillance  is  non-maleficence.  Surveillance  subjects  must 
not  be  harmed  in  the  course  of  the  surveillance  program.  When  invasive  procedures  are 
deemed necessary to the surveillance system--including psychologically as well as
physically invasive procedures--care must be taken that subjects do not suffer undue
reactions (9).

Epidemiologists  have  recognized  a  need  to  be  culturally  sensitive  to  the  populations  they 
are  studying.  Cultural  sensitivity  may  be  a  component  of  beneficence,  non-maleficence, 
and  autonomy,  and  may  also  enhance  the  effectiveness  of  the  investigation.  Cultural 
sensitivity  is  important  not  only  during  the  course  of  surveillance  but  also  in  the 
appropriate  reporting  of  results. 

Non-maleficence  may  also  require  that  survey  participants  be  compensated  for  their 
participation. Compensation should at least cover the costs of participation--e.g.,
transportation,  lost  work  time,  and  child  care.  While  altruism  and  the  personal 
contribution  to  potential  public  health  benefits  may  motivate  some  prospective 
participants in a data collection system, additional compensation may increase the
participation of others--a pragmatic rather than an ethical justification for payment.

Protection  of  Privacy 

Non-maleficence  may  also  underlie  respect  for  privacy.  Protection  of  privacy  requires 
not  only  restraint  in  intrusion  and  in  the  disturbance  of  persons  in  their  private  lives 
but  assurance  that  once  information  (or  a  specimen)  has  been  collected,  it  will  not  be 
distributed to others in a form that identifies the surveillance subject (see Chapter
X) (16).

Beauchamp  et  al.  propose  three  situations  in  which  the  invasion  of  privacy  by 
epidemiologists  (and  surveillance  investigators)  is  justified  (7): 

•  The  invasion  of  privacy  is  a  necessary  aspect  of  the 
investigation. 

•  There  is  no  reason  to  suspect  that  subjects  of  the 
investigation  will  be  placed  at  substantial  risk  (e.g.,  of 
being fired or divorced).

•  The  research  must  have  potential  social  benefit. 

In  Public  Law  93-579  (17),  the  Congress  states  the  following: 

"(2) the privacy of an individual is directly affected by
the collection, maintenance, use, and dissemination of
personal information by Federal agencies;...

(4) the right to privacy is a personal and fundamental right
protected by the Constitution of the United States; and

(5) in order to protect the privacy of individuals identified
in information systems maintained by Federal agencies, it
is necessary and proper for the Congress to regulate the
collection, maintenance, use, and dissemination of information
by such agencies."

In  the  United  States,  public  health  surveillance  activities  conducted  under  the  auspices 
of  the  Executive  Branch  (thus  including  the  Department  of  Health  and  Human  Services  and 
the  Bureau  of  the  Census)  are  regulated  by  the  Public  Health  Service  Act  and  by  the 
Privacy Act of 1974 (17). Both acts regulate contractors of federal agencies as well as
the agencies themselves. Regulations apply to "establishments"--i.e., institutions--as
well as to individuals surveyed. They address "systems of records" "... from which
information is retrieved by the name of the individual or by some identifying number,
symbol or other identifying particular assigned to the individual" (27). Thus, records
without  identifiers  are  exempt  from  these  regulations. 

While  the  Privacy  Act  focuses  on  the  disclosure  and  dissemination  of  information  already 
collected,  the  act  also  restricts  surveillance  information  that  may  be  collected  by 
stipulating  that  records  may  contain  only  "such  information  about  an  individual  as  is 
relevant  and  necessary  to  accomplish  a  purpose  of  the  agency...."  This  enforces  the 
ethical  obligation  to  conduct  surveillance  on  issues  with  potential  public  health  benefit. 
In  addition,  the  Privacy  Act  prohibits  use  of  surveillance  (or  other  information)  "for 
any  purpose  other  than  the  purpose  for  which  it  was  supplied  unless  such  establishment 
or person has consented...to its use for such other purpose...." (18).

The  Privacy  Act  gives  individuals  the  right  to  obtain  their  own  records,  to  correct  errors 
in  the  record,  and  to  receive  an  accounting  of  how  the  record  has  been  disseminated. 
Exemptions  to  individual  access  include  the  use  of  records  maintained  for  statistical 
purposes only (rather than for administrative use). Census information, for example, is
exempt.  Exemptions  must  meet  specific  criteria  and  must  be  published  in  the  Federal 
Register. 

The  Privacy  Act  requires  that  federal  agencies  train  and  regulate  personnel  with  access 
to  record  systems  and  that  agencies  maintain  physical  means  of  protecting  records  from 
unwarranted  access.  Agencies  are  also  required  to  describe  their  record  systems  and  to 
report  procedures  used  to  comply  with  requirements  in  the  Federal  Register.  Criminal 
penalties  and  fines  may  be  imposed  on  persons  who  violate  the  stipulations  of  the  act. 

Informed  Consent 

The Privacy Act regulates not only the collection and maintenance of record systems, but
also the informed consent procedures by which records are collected and matters of
confidentiality involved in the dissemination of records that have been collected.
Informed consent is a requirement based on respect for autonomy. Informed consent must be
obtained primarily in the context of surveys and studies. Administrative, medical-care,
and legally mandated information-collection systems should also consider obtaining
informed consent. The Privacy Act requires that potential participants in record systems
be informed of a) the authority under which the data are collected, b) the purposes of
the information, c) routine uses of the information, and d) the consequences of not
participating. Informed consent is required for "establishments" (through their
representatives) as well as for individuals.

Epidemiologists  and  philosophers  have  proposed  several  elements  to  be  included  in 
comprehensive  informed  consent: 

•  Reasonable  disclosure  of  the  goals  and  uses  of  the  study 
(or surveillance activity).

•  Evidence  of  comprehension  on  the  part  of  prospective 
participants.  The  response  of  potential  respondents  to 
surveys  following  appropriate  information  is  sometimes 
regarded  as  evidence  of  consent,  despite  the  lack  of 
evidence of respondent comprehension (19).

•  Voluntariness  on  the  part  of  prospective  participants. 
"All  forms  of  duress  or  undue  influence  are  to  be 
scrupulously avoided" (7).

•  Competence  on  the  part  of  prospective  participants. 

•  Consent  of  prospective  participants. 

Possible harm of the surveillance--e.g., from some physical test--should also be explained
to  prospective  participants.  To  guarantee  autonomy,  comprehensive  informed  consent  should 
also  be  receptive  to  informed  dissent  and  non-participation  or  to  withdrawal  at  any  point 
in  the  research  or  surveillance  activity. 

Feinleib (5) argues that "the first responsibility of the epidemiologist to the subject
is  to  be  clear  about  the  objectives  of  the  study."  He  also  allows  that,  when  the  goals 
of  epidemiologic  investigations  (or  surveillance)  are  complex  or  when  full  disclosure 
might  bias  responses,  comprehensive  disclosure  may  not  be  required,  so  long  as  the 
respondent  is  "...not  deliberately  misled  into  participating  in  a  study  that  the 
investigator  knows  is  against  the  respondent's  interests"  (9).  This  paternalistic 
principle  may  compromise  the  participant's  autonomy. 

Disclosure, Dissemination, and Confidentiality

The  Privacy  Act  forbids  the  disclosure  of  information  in  which  individual  identity  is 
ascertainable,  unless  the  subject  has  agreed  to  disclosure.  This  principle  thus  protects 
the  confidentiality  of  individuals  and  affects  the  dissemination  of  surveillance  findings 
(see Chapter X).

Records  protected  by  the  Privacy  Act  are  exempt  from  Freedom  of  Information  Act  (FOIA) 
requests.  FOIA  specifically  exempts  "personal  and  medical  files  and  similar  files  the 
disclosure  of  which  would  constitute  a  clearly  unwarranted  invasion  of  personal  privacy" 
and  matters  "specifically  exempted  from  disclosure  by  statute"  (19).  Federal  surveillance 
data  are  also  commonly  exempt  from  subpoena  and  may  be  explicitly  exempted  by 
authorization of the Secretary of Health and Human Services (18). Census data, too, are
exempt from FOIA access.

There are several dimensions of disclosure (19):

• Exact disclosure indicates a precise (numerical) value of some
characteristic (e.g., precise income or age) associated with an
individual, while approximate disclosure indicates a range of
values associated with an individual.

•  Probability-based  disclosure  indicates  the  likelihood  (<100%) 
that  some  characteristic  is  associated  with  an  individual, 
while  certainty  disclosure  indicates  (with  100%  likelihood)  that 
the characteristic is associated with the individual.

•  Internal  disclosure  associates  an  individual  with  a  characteristic 
on  the  basis  of  evidence  found  within  one  particular  study  or 
survey,  while  external  disclosure  associates  individuals  and 
characteristics  by  linking  studies  or  surveys. 
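
The distinction between exact and approximate disclosure can be illustrated with a minimal sketch. The function names, the 10-year age width, and the income cut points below are hypothetical choices made for illustration only; they are not taken from the Privacy Act or from any agency guideline.

```python
from datetime import date


def age_range(age: int, width: int = 10) -> str:
    """Replace an exact age with the range that contains it."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def birth_year(birth_date: date) -> int:
    """Release only the year of a date of birth, not the full date."""
    return birth_date.year


def income_band(income: int, cut_points=(10_000, 25_000, 50_000)) -> str:
    """Replace an exact income with a band; the cut points are illustrative."""
    previous = 0
    for cut in cut_points:
        if income < cut:
            return f"{previous}-{cut - 1}"
        previous = cut
    return f"{previous}+"


# A record released this way discloses ranges (approximate disclosure)
# rather than precise values (exact disclosure).
record = {"age": 37, "born": date(1955, 6, 15), "income": 12_345}
released = {
    "age": age_range(record["age"]),
    "born": birth_year(record["born"]),
    "income": income_band(record["income"]),
}
```

Wider ranges reduce the risk of identification but also reduce analytic detail, so the widths would need to be chosen against the goals of the particular surveillance system.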

Since absolute protection against disclosure might make the use of surveillance information
impossible and would severely hamper programs of disease control and prevention,
nondisclosure requirements have been interpreted as protecting individuals from harm while
allowing appropriate use of surveillance information. For example, publication of
analyses or tables with small numbers of conditions such as fetal or infant deaths or
deaths from rabies in a county--allowing the identification of individuals--is said to
be reasonable because these exceptions "...have been accepted traditionally and because
they rarely, if ever, reveal any information about individuals that is not known socially"
(20). Also exempt is publication of small numbers if the identifying characteristics are
judged not to be "sensitive."

Two  kinds  of  breaches  of  confidentiality  should  be  differentiated.  In  the  first, 
information  collected  in  confidence  by  a  clinician  or  public  health  practitioner  should 
be  divulged  if  the  information  substantially  threatens  the  welfare  of  another  person 
(21,22).  Divulging  information  need  not  reveal  the  identity  of  the  first  individual,  but 
such  revelation  may  be  unavoidable.  This  is  a  common  occurrence  associated  with  "contact 
tracing"  for  sexually  transmitted  diseases.  The  public  health  responsibilities  of 
clinicians  and  public  health  practitioners  may  override  duties  of  confidentiality  to 
individual  patients  and  surveillance  subjects,  even  though  their  actions  abrogate  privacy, 
autonomy,  and  even  beneficence.  In  the  second  kind  of  breach  of  confidentiality, 
revelation  of  information  and  the  identity  of  an  individual  serves  no  public  health 
purpose  and  is  therefore  unethical. 

Several  techniques  may  mitigate  the  likelihood  of  disclosure  and  may  legitimate  the 
publication  of  otherwise  protected  data:  a)  small  samples  (e.g.,  <10%  of  the  data)  hamper 
efforts  to  identify  which  individual  in  the  population  a  sampled  individual  represents, 
b)  the  deliberate  creation  of  errors  or  imputations  of  missing  data  allows  that  any  given 
datum  may  be  an  error  or  an  imputation  rather  than  a  true  observation,  c)  incompleteness 
of  reporting  allows  that  an  individual  may  not  have  been  included  in  the  survey,  and  d) 
lack  of  sensitivity  of  the  information  in  question  (because  of  prior  publication  or 
historical  time  frame),  so  that  publication  reveals  no  harmful  information. 

In  the  United  States,  individual  states  use  surveillance  information  for  their  own 
disease-control  programs.  As  major  surveillance  agencies,  the  states  have  been  critically 
concerned with issues of confidentiality (23). While all states have provisions for
complying  with  freedom  of  information  requests  and  maintaining  confidentiality  of 
information,  they  vary  in  specific  regulations  and  their  enforcement.  Twenty-five  states 
have  general  confidentiality  requirements  with  little  specific  definition;  seven  states 
require  written  consent  for  release  of  information;  five  states  exclude  surveillance 
information  from  subpoena;  and  10  states  have  penalties  for  unlawful  disclosure  of 
information on some or all reported infectious diseases (23). The states are concerned
with the protection of the confidentiality of data released for federal surveillance
systems and, in collaboration with CDC, have established confidentiality guidelines (23).

Several  procedures  are  commonly  used  to  protect  the  confidentiality  of  records  in 
surveillance  investigation  settings,  disseminated  data  sets,  and  published  tabulations 
and  analyses: 

a.  Names  or  other  personal  identifiers  are  necessary  in  public  health 
surveillance  for  two  principal,  related  purposes:  to  follow  up  individuals  for  the 
determination  of  subsequent  health  events  and  to  link  data  systems  for  additional 
information  on  individuals.  Surveillance  functions  which  require  neither  follow-up  nor 
linkage  may  avoid  problems  of  confidentiality  by  not  using  names  or  other  identifiers. 
It  should  be  noted,  however,  that  the  absence  of  identifiers,  as  in  "blinded"  studies, 
may  preclude  informing  surveillance  subjects  of  adverse  surveillance  findings. 

b. When names or other identifiers are justified, problems of disclosure may
be minimized with use of protected or "scrambled" identifiers, which make association
between records and individuals difficult. A common means of minimizing disclosure is
to keep only such identifiers in the record system and to maintain the files that relate
identifiers to individuals separately, in secure areas.

c.  Identifying  information  can  be  destroyed  once  it  has  served  its  designated 
follow-up  or  linkage  function. 

d. The collection of data that will not be used and that might serve to identify
individuals should be avoided.

e.  Precise  data--e.g.,  dates  of  birth  or  death  or  income  in  exact  dollar 
amounts,  residence  by  block  or  street  or  address--are  rarely  essential;  data-range 
specifications  are  most  often  adequate  for  surveillance  purposes.  Since  precise  data 
facilitate  identification  of  individuals,  the  use  of  data  ranges  is  preferable  if 
surveillance  goals  can  be  achieved  with  such  information. 

f. In some surveillance investigations, linkage with other surveillance sources
is  necessary  to  determine  additional  information.  In  this  case,  the  Privacy  Act  requires 
that  federal  agencies  and  personnel  involved  be  trained  in  and  comply  with  common 
regulations  of  privacy  and  confidentiality. 

g. Suppression of analyses or tables with cells with small numbers in
publications (19):

i) no table should include a row or column in which all
cases are found in one cell,

ii) the marginal total of any row or column should not be
fewer than three,

iii) no estimate should be based on fewer than three cases,

iv) no estimates should be published if one case contributes
more than 60% to that estimate,

v & vi) no characteristics of individuals should be
identifiable by calculation from other tabulated data in
the same or other data sets.

Solutions to the problem of small numbers may be the aggregation of rows or columns or
the suppression of data in cells and marginal totals.
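
The small-number rules listed above lend themselves to a simple automated check before a table is released. The following is a minimal sketch, not an established disclosure-control tool: the function names and the representation of a table as a list of rows of counts are assumptions made for illustration. It covers rules i through iv only; rules v and vi require comparison across tables and data sets and are not attempted here. Rule ii is applied literally, so an empty row or column (total of zero) is also flagged.

```python
from typing import List, Sequence

MIN_CASES = 3     # rules ii and iii: "fewer than three"
MAX_SHARE = 0.60  # rule iv: one case contributing "more than 60%"


def table_violations(table: Sequence[Sequence[int]]) -> List[str]:
    """Describe each row or column of a table of counts that breaks rule i or ii."""
    problems = []
    rows = [list(row) for row in table]
    columns = [list(column) for column in zip(*rows)]
    for kind, lines in (("row", rows), ("column", columns)):
        for index, cells in enumerate(lines):
            total = sum(cells)
            # Rule ii: the marginal total should not be fewer than three.
            if total < MIN_CASES:
                problems.append(
                    f"{kind} {index}: marginal total {total} is below {MIN_CASES}"
                )
            # Rule i: all cases of the row or column are found in one cell.
            if total > 0 and len(cells) > 1 and max(cells) == total:
                problems.append(f"{kind} {index}: all {total} cases are in one cell")
    return problems


def estimate_is_publishable(contributions: Sequence[float]) -> bool:
    """Apply rules iii and iv to the per-case values behind a single estimate."""
    # Rule iii: no estimate based on fewer than three cases.
    if len(contributions) < MIN_CASES:
        return False
    # Rule iv: no single case may contribute more than 60% of the estimate.
    total = sum(contributions)
    if total > 0 and max(contributions) / total > MAX_SHARE:
        return False
    return True


# Hypothetical table of counts (rows by columns), invented for illustration.
example = [[5, 0], [3, 4]]
for problem in table_violations(example):
    print(problem)
```

When a check of this kind reports a problem, the remedies named in the text apply: rows or columns can be aggregated until the totals are large enough, or the affected cells and marginal totals can be suppressed.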


Veracity 

In  the  ethics  of  public  health  surveillance,  the  principle  of  veracity  is  usually 
considered  in  the  disclosure  by  investigators  of  the  goals  and  uses  of  surveillance 
information.  However,  veracity  may  also  be  an  ethical  duty  of  surveillance  subjects  (to 
the  investigator  as  well  as  to  society)  once  they  participate.  Deception  by  subjects  may 
contribute  to  erroneous  results  and  public  health  harm. 

Investigators  and  Persons  in  Subjects'  Social  Environments 

During  the  course  of  surveillance,  it  may  be  discovered  that  some  condition  of  the 
surveillance  subject  (e.g.,  an  infectious  disease  or  violent  intentions)  might  severely 
affect  or  might  have  affected  the  well-being  of  other  persons  in  the  subject's  social 
environment.  In  this  case,  it  may  be  the  ethical  duty  of  investigators  to  inform 
appropriate  authorities  (e.g.,  public  health  officials  or  law  enforcement  agents)  of  these 
circumstances  (9).  Paternalistic  social  beneficence  might  justify  the  breach  of 
confidentiality. 

Surveillance  and  the  Public  Health  Community 

Public  health  surveillance  practitioners  have  the  duty  of  having  their  work  reviewed  by 
colleagues  for  ethical  as  well  as  scientific  integrity;  they  also  have  the  responsibility 
of reviewing the work of others. The review process requires the sharing of methods and
findings. Ethical--as well as scientific--critiques must be balanced. "Epidemiologists
and many research scientists often search in detective-like fashion for flaws in the
studies of those they review, even though the studies may contain substantial merit" (7).

While some agencies have policies to protect researchers' primary use and control of the
data they collect (24), others have favored broader access (25). Ethical principles
justifying broad access are detailed below.

• Enhancing the quality of science by allowing reanalysis
and confirmatory studies--thus potentially contributing
to public welfare

•  Expanding  knowledge  by  facilitating  additional  analyses-- 
thus  also  potentially  contributing  to  public  welfare 

•  Reducing  the  burden  of  surveillance  on  subjects 

•  Reducing  the  burden  of  surveillance  on  practitioners 

Epidemiologists  and  ethicists  have  also  argued  that  practitioners  have  the  obligation  to 
promote  ethical  behavior  in  the  public  health  community  and  to  confront  ethically 
unacceptable behavior of colleagues (7).

CLINICIANS  AND  THE  PUBLIC  HEALTH  COMMUNITY 

Physicians,  laboratorians,  and  other  health-care  practitioners  play  a  critical  role  in 
reporting  infectious  diseases  to  local  and  state  health  departments.  Reporting  traumatic 
events (e.g., gunshot wounds and child abuse) is also required in some states (26).
Fulfilling  these  duties  may  prevent  further  infection  or  trauma.  While  reporting  selected 
diseases  and  injuries  is  mandatory  for  physicians  and  others  in  all  states,  completeness 
of reporting is said to range from 6% to 90% for many notifiable diseases (27); reporting
laws  are  seldom  enforced. 

Investigators  and  Clinicians 

Investigators  have  a  duty  to  report  findings  to  clinicians.  Findings  may  concern  the 
welfare  of  a  clinician's  patients  who  have  been  surveillance  subjects.  Findings  from 
surveillance  investigations  may  also  have  implications  for  patients  in  general  or  patients 
with  certain  conditions. 

The  scale  and  significance  of  public  health  surveillance  demand  scrupulous  and  ongoing 
attention to ethics as well as to science (Table IX.2). Ethics should not be regarded
as an afterthought, or worse, an obstacle, to professional practice, but as an element
vital to its foundation and goals.


CHAPTER X


Public  Health  Surveillance  and  the   Law 


Gene  W.   Matthews 
R.    Elliott  Churchill 


"The  people's  good  is  the  highest   law." 

Marcus  Tullius  Cicero 


INTRODUCTION 

Public  health  surveillance  and  the  law  are  joined  by  so  many  interconnecting  links  that 
virtually  every  aspect  of  a  surveillance  program  is  associated  with  one  or  more  legal 
issues.  In  the  United  States,  and  throughout  the  world,  many  surveillance  efforts 
have  been  effected  through  mandates  enforced  by  statutes  or  regulations.  By  the  same 
token,  reports  derived  from  the  interpretation  and  application  of  data  from 
surveillance  programs  have  been  used  to  drive  legislation  relating  to  public  health. 

Public  health  surveillance  involves  the  collection,  analysis,  interpretation,  and 
dissemination  of  data.  It  may  be  useful  to  have  a  working  definition  of  the  law  to 
meld  with  this  description  of  surveillance.   In  essence,  as  Wing  observes,  the  law 
is "the sum or set or conglomerate of all of the laws in all of the jurisdictions:
the constitutions, the statutes and the regulations that interpret them, the
traditional principles known as common law, and the judicial opinions that apply and
interpret all these legal rules and principles" (1). However, that is by no means
all.  The  law  is  also  the  legal  profession,  and,  in  order  to  understand  the  law,  we 
must  try  to  understand  the  lawyers--how  they  think,  how  they  speak,  and  what  roles 
they  play  in  the  legal  process.  In  addition,  from  a  very  practical  point  of  view, 
the law is also the legal process--legislatures and their politics, as well as the
time,  efforts,  and  costs  associated  with  changes  in  legislation.  Finally,  the  law 
is  what  it  is  interpreted  to  be.  This  takes  us  back  to  the  lawyers,  as  well  as  to 
the  judges  in  the  legal  system. 

We cannot avoid what Wing describes as "the traditional barrier" between the legal
profession and the rest of the world. He continues with the observation that "the
legal profession has for centuries done many things to surround the practice of law
with a quasi-mystical aura. Much as the medical profession would have us believe
that there is something almost sacred about medical judgment and that only a
physician can understand it, lawyers have perpetuated the only partially justified
myth that there is something called legal judgment that only someone with the proper
mix of formal education, practical experience, and appropriate vocabulary can make"
(1).

"The basic function of the law is to establish legal rights, and the basic purpose
of the legal system is to define and enforce those rights...." "Legal rights" are
the "relationships that establish privileges and responsibilities among those
governed by the legal system" (1). This concept of "legal rights" does not purport
to  cover  freedoms  or  interests  given  unconditional,  global  protection,  but  rather  it 
covers  the  protection  of  carefully  specified  interests  against  the  effects  of  other 
carefully  specified  interests.  Finally,  some  rights  are  protected,  not  by  statute 
or  regulation,  but  by  an  understanding  and  application  of  the  prevailing  ethics  in 
an  area.  In  general,  ethics  are  regulated  through  whatever  sanctions  are  imposed 
against censured behavior by peers or colleagues (see Chapter IX).

This orientation is pivotal in our discussion of legal issues associated with
surveillance  because  the  reader  must  continue  to  be  alert  to  the  fact  that  everything 
in  this  chapter  is  subject,  first  of  all,  to  different  interpretations  in  different 
legal  settings,  and,  second,  to  amendment  of  both  statute  and  practice. 

The  task  of  surveillance  as  an  applied  science  could  be  simplified  considerably  by 
avoiding  any  discussion  of  legal  issues.  Although  this  observation  is  probably 
valid,  we  have  already  pointed  out  that  surveillance  very  often  takes  place  under 
statute.  Beyond  this  fact,  the  relevance  of  the  definition  of  the  police  powers  of 
a  state  must  be  acknowledged,  i.e.,  "powers  inherent  in  the  state  to  prescribe, 
within  the  limits  of  state  and  federal  constitutions,  reasonable  laws  necessary  to 
preserve  the  public  order,  health,  safety,  welfare,  and  morals"  (2).  That  describes 
a  sweeping  scope  of  authority  and  certainly  covers  anything  that  would  be  dealt  with 
under the heading of "public health surveillance."

In  other  words,  one  cannot  look  at  surveillance  and  claim  to  have  created  an  accurate 
picture  without  considering  the  legal  constraints  and  processes  that  accompany  it-- 
particularly  since,  for  public  health  surveillance,  we  have  added  the  component  of 
"timely dissemination of the findings" to our definition of surveillance. How
information  is  collected,  from  and  about  whom  it  is  collected,  how  it  is  interpreted, 
and  how  and  to  whom  the  results  are  disseminated  all  must  be  scrutinized  under  the 
umbrella  of  "accepted  practice"  and  "the  law."  The  sections  that  follow  contain 
information  specific  to  the  United  States,  but  for  an  international  orientation,  the 
issues  and  concerns  remain  basically  constant,  while  the  written  body  of  the  law  and 
the  process  through  which  the  law  is  enacted  and  enforced  vary  widely. 

If  the  reporting  component  of  public  health  surveillance  is  treated  as  a  requirement, 
one  can  assert  that  such  surveillance  began  in  the  United  States  in  1874  in 
Massachusetts,  when  the  State  Board  of  Health  instituted  the  first  statewide 
voluntary  plan  for  weekly  reporting  of  prevalent  diseases  by  physicians.  By  the  turn 
of  the  century,  the  forerunner  of  the  Public  Health  Service  had  been  established,  and 
laws  in  all  states  required  that  certain  communicable  diseases  be  reported  to  local 
authorities (3).

SURVEILLANCE IN THE EARLY YEARS (1900-1930)

With  the  development  and  growth  of  surveillance  in  the  United  States  in  the  early  1900s 
came  the  inevitable  conflicts  created  when  the  interests  of  one  human  being  conflict  with 
those  of  another  individual  or  political  unit.  Much  of  the  debate  took  place  because  of 
the problem the United States was experiencing with sexually transmitted diseases--which
became  even  more  acute  with  the  participation  of  American  troops  in  World  War  I.  The 
issues  were  basically 

•  the  moral  dilemma  created  by  not  reaching  consensus  on  the  purpose  of 
information  obtained  through  surveillance  (i.e.,  whether  to  direct  control 
efforts  toward  sexual  behavior  of  the  individual  or  toward  the  disease 
agents),

•  the  debate  surrounding  the  duty  of  the  physician  to  his/her  patient  and  to 
society,  and 

•  the  disagreement  about  whether  government  provision  of  health  services 
comprised  unfair  competition  to  the  private  practitioner. 

Since  these  concerns  still  have  not  been  completely  resolved  in  the  United  States  as  of 
the  1990s,  they  are  examined  in  more  detail. 

Social  Hygiene  Versus  the  Scientific  Approach 

By the early 1900s, the epidemiology of syphilis was reasonably well documented. This
understanding did not constitute an unmixed blessing. As William Osler told his students
at the Johns Hopkins Medical School in 1909, "In one direction our knowledge was widened
greatly. It added terror to an already terrible disorder" (4). Aside from the scope of
the destructive powers of syphilis, physicians were just beginning to appreciate the fact
that many "innocent victims" were contracting this disease. The prevailing wisdom of
earlier years of "reaping what one sowed," as well as other statements of poetic and moral
justice,  was  no  longer  adequate  when  women  of  "good  family"  and  unblemished  reputation 
were  known  to  have  contracted  syphilis  from  their  spouses  and  when  children  suffered 
severe  effects  from  congenital  syphilis. 

What  the  medical  and  public  health  officials  apparently  had  the  most  difficulty 
reconciling  was  how  to  direct  their  efforts  to  deal  with  the  growing  problem  of  syphilis. 
Both surveillance and treatment efforts could be directed toward a) people, a focus on
behavior  modification  through  education  as  a  control  strategy  or  b)  the  disease  vector, 
a  focus  on  the  organism  that  caused  the  disease  and  how  to  eradicate  it  from  individuals 
and  society  at  large.  Neither  approach  to  syphilis  control  was  ever  agreed  to  be  the 
ideal,  and,  in  fact,  the  two  in  combination  have  still  not  proved  totally  effective.  The 
tensions  represented  by  the  "moralistic"  and  the  "scientific"  approaches  are,  moreover, 
still  quite  evident  in  public  health  practice  and  surveillance  in  the  1990s. 

One  only  has  to  review  the  popular  press  for  the  past  several  years  to  see  how  the  "moral 
versus  scientific"  dilemma  relates  to  public  health  in  the  context  of  such  currently 
serious  problems  as  human  immunodeficiency  virus/acquired  immunodeficiency  syndrome 
(HIV/AIDS)  and  the  reemergence  of  multidrug-resistant  strains  of  tuberculosis. 

Duty  of  Physicians 

The  concept  of  the  confidential  nature  of  communication  between  patient  and  physician  is 
clearly  stated  in  the  Hippocratic  Oath  and  has  continued  to  be  emphasized  in  legal  and 
social  settings.  In  the  context  of  the  syphilis  epidemic  in  the  United  States  in  the 
early  years  of  the  20th  century,  this  concept  became  a  crucial  point  of  debate  in  efforts 
to  control  the  spread  of  the  disease.  Physicians  did  not  wish  to  breach  the  confidence 
relied  on  by  their  patients  by  reporting  cases  of  syphilis  to  the  authorities;  by  the  same 
token,  if  they  did  not  report  the  occurrence  of  syphilis--if  not  to  the  authorities  at 
least  to  the  patients'  spouses--they  were  tacitly  participating  in  the  continued 
transmission  of  the  disease  to  "innocent  victims."  The  entire  issue  boils  down  to  primary 
responsibility  to  an  individual  or  to  society.  It  clearly  has  not  been  resolved  but 
constitutes  an  important  component  of  the  success  or  failure  of  present-day  surveillance 
efforts. 

Economic  Competition 

Also  as  yet  unresolved  is  the  problem  created  for  public  health  officials  and  for 
practicing  physicians  in  the  early  1900s  by  the  need,  on  the  one  hand,  to  have  physicians 
report  all  cases  of  sexually  transmitted  disease  and  to  establish  public  health  clinics 
to  provide  prompt  treatment  and  education  to  patients  and,  on  the  other  hand,  the  need 
for  public  health  officials  to  protect  the  financial  interests  of  physicians  by  not 
infringing  on  their  turf  and  removing  paying  customers  to  free  or  financially  subsidized 
facilities. At the same time, it did not seem reasonable to expect the physicians to make
such  reports  and  refer  such  patients  for  treatment  elsewhere  when  it  would  mean,  in 
essence,  taking  money  out  of  their  own  pockets.  For  surveillance  efforts,  this  dilemma 
guaranteed  underreporting  of  cases,  with  the  selective  reporting  of  cases  representing 
patients  who  could  not  pay  and  the  withholding  of  reports  of  cases  representing  patients 
who could pay.

Of  concern  to  the  1990s  surveillance  effort,  and  again  in  the  context  of  HIV/AIDS, 
physicians  might  choose  not  to  report  cases  of  HIV  positivity  for  fear  their  patients 
might  be  discriminated  against  in  a  work  or  social  setting.  Problems  with  insurance 
coverage  might  also  lead  to  such  underreporting. 

ERA OF GRADUAL GROWTH IN MANDATED SURVEILLANCE (1940s-1970s)

During  the  period  of  the  1940s-1970s,  states  added  many  diseases  to  their  mandatory 
reporting  lists.  Even  in  states  that  did  not  enact  legislation  to  require  additional 
reporting,  surveillance/reporting  efforts  were  broadened  during  this  period  through  state 
regulation or directive from the state health commissioners (5).

In  contrast,  surveillance  and  reporting  to  agencies  in  the  federal  government  were--and 
continue to be--voluntary. The resulting discrepancy in data obtained on a particular
disease  at  the  state  and  federal  levels  leads  to  problems  in  analysis  and  interpretation. 
However, several professional organizations, including the Association of State and
Territorial Health Officials (ASTHO) and the Council of State and Territorial
Epidemiologists (CSTE), have been instrumental in setting up a patchwork system to
coordinate  and  improve  the  quality  and  completeness  of  surveillance  data. 

A major factor in the development of surveillance planning and implementation during this
period was the institution in 1976 of the Federal Protection for Human
Subjects Regulations. One of the best known of the regulations states the
requirement  that  "informed  consent"  be  obtained  from  any  person  who  is  asked  to 
participate  in  a  medical  research  project.  In  addition,  the  regulation  covers 
compensation for persons injured during the course of the project and confirmation of the
ethics of the research being conducted.

CURRENT  LEGAL  ISSUES  (1980  to  the  Present) 

There  is  little  dispute  that  biomedical  research  and  surveillance  activities  of  the  1980s 
were  greatly  affected  by  concerns  and  reactions  associated  with  the  HIV/AIDS  epidemic. 
All  the  old  issues  from  early  in  the  20th  century  reemerged  at  critical  levels:  Do  we 
want  to  treat  persons  for  the  disease,  or  do  we  want  to  modify  their  behavior  in 
control/prevention efforts? Is the physician's primary duty to protect a patient's
privacy  or  to  the  greater  good  of  society?  Is  the  public  health  machine  treading  on  the 
physician's  turf  by  advertising  and  providing  medical  treatment  more  inexpensively  than 
the  physician  can? 

Although  these  questions  still  need  to  be  answered  fully,  public  health  action  cannot  wait 
until  consensus  is  reached  before  constructing  and  applying  interventions.  The  sections 
below  examine  four  key  legal  issues  that  relate  to  these  questions  and  have  a  major  impact 
on  surveillance  in  the  1990s. 

Personal  Privacy 

The  right  of  an  individual  to  have  his/her  privacy  protected  under  the  law  is  a  vast  gray 
area.  The  U.S.  Constitution  does  not  specify  a  right  to  privacy,  although  particulars 
relating  to  the  protection  of  privacy  under  particular  circumstances  are  included  in  the 
Bill  of  Rights  (protection  from  "search  and  seizure,"  etc.).  As  noted  earlier  in  this 
chapter,  the  issue  of  right  to  privacy  and  the  physician's  role  in  protecting  that  privacy 
through  the  concept  of  privileged  communication  emerged  as  a  hotly  debated  issue  during 
the  war  on  sexually  transmitted  diseases  in  the  United  States  in  the  early  years  of  the 
20th  century.  The  concept  of  the  so-called  "medical  secret"  (6)  involved  the  dilemma  that 
faced  a  physician  whose  male  patient  had  a  sexually  transmitted  disease  (for  which  there 
was no sure cure), whose reputation the physician wished to spare, but whose spouse or
future  spouse  was  at  risk  of  having  the  disease  if  the  physician  did  not  step  forward  and 
report  it.  Many  physicians  opted  to  remain  within  the  accepted  double  standard  of 
behavior  of  the  day  and,  according  to  Prince  Morrow,  became  "accomplices"  in  the  further 
transmission  of  infection  (7).  The  medical  secret  was  described  by  one  physician  as  a 
"blind  policy  of  protecting  the  guilty  at  the  expense  of  the  innocent,"  and  a  New  York 
attorney ventured the opinion that "a physician who knows that an infected patient is
about to carry his contagion to a pure person, and perhaps to persons unborn, is justified
both  in  law  and  in  morals,  in  preventing  the  proposed  wrong  by  disclosing  his  knowledge 
if  no  other  way  is  open"  (7). 

Unfortunately, the right-to-privacy issue was no more resolved in the early 20th-century
United States than was the nationwide public health problem of
sexually transmitted diseases. Public health officials continue to struggle with
questions associated with privacy and the rights of the individual versus the good of
society to this day.

The landmark case relating to the right of an individual to privacy was Griswold v.
Connecticut, 381 U.S. 479 (1965), which resulted from the arrest of the director of the
Planned Parenthood League of Connecticut (Griswold) on the grounds that she had provided
information, instruction, and medical advice about contraception to married people. In
Connecticut at the time, the use of contraceptives was punishable by
law. Subsequently, the U.S. Supreme Court declared the Connecticut law to be
unconstitutional and reversed the criminal convictions in the case. In the majority
opinion written for the Court by Justice William Douglas, there are references to the
so-called "penumbras" or auras of privacy that radiate out from the specific rights to
privacy stated in the Bill of Rights. He observed that "various guarantees create zones
of privacy" (8). He went on to say that the Connecticut law exceeded its bounds by
seeking to regulate the use of contraceptive devices rather than their manufacture and/or
sale. The only means he could postulate for enforcing the law as written involved the
invasion of the clearly defined zone of privacy represented by marriage. Lest anyone
misunderstand his meaning, he observed: "Would we allow the police to search the sacred
precincts of marital bedrooms for telltale signs of the use of contraceptives? The very
idea is repulsive to the notions of privacy surrounding the marriage relationship" (8).

Later courts would refer to this constitutionally recognized right of the individual to
privacy in certain contexts as a "fundamental interest." In the precedent-setting
abortion case of Roe v. Wade, 410 U.S. 113 (1973), a single woman challenged the
constitutionality of a Texas law forbidding abortion (except when the pregnant woman's
life was in jeopardy). She claimed that this law denied her constitutional right to
privacy  and  cited  the  earlier  opinions  of  the  Supreme  Court  relating  to  birth  control. 
Justice Blackmun observed that "the state does have an important and legitimate interest
in preserving and protecting the health of the pregnant woman... [and] it has still another
important  and  legitimate  interest  in  protecting  the  potentiality  of  human  life.  These 
interests  are  separate  and  distinct.  Each  grows  in  substantiality  as  the  woman  approaches 
term  and,  at  a  point  during  pregnancy,  each  becomes  'compelling'"  (9). 

The link between the right to privacy and surveillance is also related to the Freedom of
Information Act (amended 1986). In essence, the act spells out the situations and
conditions  pertaining  to  the  right  of  the  U.S.  taxpayer  to  obtain  information  s/he  has 
paid  for  from  agencies  within  the  Federal  Government.  Clearly,  there  is  the  potential 
for  conflicting  interests  in  such  situations,  if  information  about  taxpayer  A  is  released 
to  taxpayer  B.  The  act  takes  this  point  into  consideration  in  its  statement  that  "to  the 
extent  required  to  prevent  a  clearly  unwarranted  invasion  of  personal  privacy,  an  agency 
may  delete  identifying  details  when  it  makes  available  or  publishes  an  opinion,  statement 
of policy, interpretation, or staff manual or instruction" (10).

An  essential  aspect  in  designing  a  surveillance  program  is  the  assurance  to  the  persons 
(agencies)  who  report  and  those  being  reported  upon  that  the  privacy  rights  of  the  persons 
whose  health  information  is  of  interest  will  not  be  violated.  The  conflict  created  by 
the  "right  to  privacy"  and  the  "need  to  know"  represents  an  area  that  must  be  monitored 
by  the  managers  of  a  surveillance  program  as  diligently  as  they  monitor  the  health 
conditions  to  be  reported.  To  illustrate:  One  of  the  most  important  court  decisions  the 
Centers  for  Disease  Control  (CDC)  has  obtained  in  recent  years  related  to  litigation 
arising  out  of  the  epidemic  of  toxic-shock  syndrome  of  the  late  1970s  and  early  1980s. 
The  attorneys  representing  the  manufacturer  of  the  tampon  that  had  been  strongly 
statistically  associated  with  the  occurrence  of  toxic  shock  syndrome  wanted  to  obtain  not 
only  data  about  women  who  had  had  toxic  shock  syndrome  and  from  whom  CDC  had  collected 
information  but  the  names  of  the  women  as  well.  The  agency  argued  (through  district  court 
and  up  to  the  Federal  Court  of  Appeals)  that  participation  in  federal  surveillance  is 
voluntary  and  that  participants  in  such  programs  have  a  reasonable  expectation  that  their 
confidentiality  will  be  protected  by  the  Federal  Government.  The  Appeals  Court  ruled  in 
CDC's  favor,  but  this  position  will  continue  to  be  challenged  on  a  "need  to  know"  basis, 
and  persons  who  are  designing  and  operating  surveillance  systems  should  always  keep  in 
mind  the  specter  of  the  forced  divulgence  of  information  they  have  assured  participants 
would  be  confidential.  This  is  particularly  likely  in  situations  involving  litigation, 
because of the courts' strong bias to make available the same information to legal
representatives for both plaintiffs and defendants.

The  final  observation  in  this  section  is  that  the  manager  of  a  surveillance  program,  at 
least  within  a  federal  agency,  is  always  in  danger  of  being  accused  by  the  popular  media 
or  the  legal  community  of  hiding  something  deliberately--not  to  protect  the  privacy  of 
individuals,  but  for  sinister  reasons  that  are  usually  hinted  at  but  not  stated.  This 
sort  of  accusation  may  have  no  basis  in  fact,  but  must  be  taken  seriously  and  generally 
requires,  at  a  minimum,  an  undesirable  outlay  of  energy  and  worry  on  the  part  of  the 
surveillance  program  manager. 

Right  of  Access 

If  the  taxpayers  support  the  gathering  of  information,  they  have  a  right  to  that 
information  (12).  This  statement  forms  one  basis  for  the  "right  to  access"  position. 
Both  the  Privacy  Act  and  the  Freedom  of  Information  Act  reflect  the  post-Watergate  era, 
with  its  focused  concern  on  the  potential  for  the  government  to  keep  secret  files 
containing  information  on  individuals.  Beyond  that  is  the  "reasonable  man"  position, 
which  maintains  that  a  person  has  a  right  to  any  information  that  is  about  him/her. 
Unfortunately, giving information to an individual about himself/herself can sometimes
have  the  effect  of  providing  information  that  assigns  liability  to  another  person  (or 
organization)  in  the  data  set.  So  even  the  process  of  providing  personal  information  to 
the  person  in  question  is  not  without  its  hazards. 

In  addition  to  the  individuals  who  wish  to  obtain  information  about  themselves,  there  are 
the  so-called  "third-party"  inquirers.  These  individuals  call  for  information  on  a  need- 
to-know  basis  and  may  range  from  members  of  the  U.S.  Congress  through  attorneys  and 
special-interest  groups  (e.g.,  "right  to  life"  or  "pro-choice"  groups)  to  representatives 
of  the  news  media. 

A  major  point  for  the  surveillance  program  manager  to  ponder  is  when  to  make  a  public-use 
data  set.  Although  there  is  no  legal  precedent  to  be  followed  here,  once  the  first  paper 
has  been  published  about  a  data  set,  it  is  prudent  to  place  that  data  set  in  the  public 
domain  if  there  is  a  reasonable  expectation  of  its  further  use.  Although  this  creates 
the  risk  of  extra  work  and  having  others  preempt  publication,  it  obviates  accusations 
about willful withholding of information or the danger that forced release of data before
they are properly prepared for public use will allow some subjects to be identified.

Product  Liability 

This  heading  could  be  'Research  Institution  Discovers  Corporate  America—and  Vice  Versa.' 
The  issue  has  been  around  for  many  years  but  seemed  to  rise  to  prominence  in  the  United 
States  with  the  emergence  of  toxic-shock  syndrome  in  the  late  1970s  and  early  1980s.  It 
is  not  unusual  for  investigations  to  show  that  a  product  is  contaminated,  that  someone 
used  a  machine  incorrectly,  or  even  that  someone  deliberately  tampered  with  a  medication 
or  device  and  caused  illness  or  death.  What  was  not  familiar  was  that  a  "good"  product, 
one  that  meets  all  its  quality-control  specifications  and  does  what  it  is  advertised  to 
do,  can  also  have  effects  that  are  less  than  desirable.  Thus,  no  one  was  ready  to  deal 
with the situation in which an efficiently designed tampon apparently led to a life-threatening
illness. The scientists had to accept the findings because scientists deal
in fact (probability), and the media had grist for their mills, but the manufacturer of
the tampon (and its employees and stockholders and legal representatives) did not have
an easy time coping with "the facts." In fact, they underwent a classic grief reaction--
which the staff at CDC and other health science agencies have since learned to anticipate
and to recognize--involving the stages of denial, anger, depression, acceptance, and
resolution.  Human  nature  was  applied  with  a  vengeance,  and  the  first  three  stages  were 
immediate,  intense,  and  enduring.  The  last  two  stages  took  some  time  and  extensive  effort 
to  induce. 

Ideally,  one  should  assure  that  surveillance  programs  are  flawless  and  that  all  the 
information  reported  is  unassailable.  In  the  world  of  public  health  practice,  such 
Utopian  standards  can  rarely  be  met.  And  public  health  practitioners  must  continue  to 
be prepared to deal with issues on a mixture of levels--including public health, legal,
ethical,  socio-cultural,  and  emotional  components. 

Litigation  Demands 

Under  litigation  demands,  the  issue  is  to  what  extent  an  agency  is  responsible  for 
providing  its  staff  to  testify  in  litigation  relating  to  findings  it  obtained  through 
surveillance  or  research.  Of  course,  there  is  no  simple  answer,  just  as  there  have  not 
been  any  simple  answers  to  the  other  questions  posed  in  this  chapter.  Clearly,  it  is  not 
responsible to refuse to provide expert testimony in any instance in which it is
solicited.  In  some  cases,  agency  scientists  may  be  the  only  ones  who  have  worked  in  the 
area  in  question  and  have  facts  to  cite.  By  the  same  token,  in  situations  in  which  there 
are  massive  numbers  of  suits  being  conducted  over  a  period  of  several  years  (as  with 
toxic-shock  syndrome  or  transfusion-associated  HIV  infection),  all  of  the  scientific 
resources  of  an  agency  could  be  expended  on  time  in  court  and,  therefore,  none  of  them 
on  the  science  that  is  their  primary  business.  Somewhere,  there  is  a  correct  answer  for 
each  agency  and  each  health  issue,  and  this  problem  may  need  to  be  faced  when  planning 
surveillance  activities. 

CONCLUSION 

For  those  who  set  up  and  run  surveillance  programs,  it  is  important  to  note  the  following 
summary  comments.  Public  health  surveillance  systems  operate  in  the  massive  goldfish  bowl 
that  encompasses  both  public  health  practice  and  the  law. 

•  Plan  and  design  surveillance  systems  so  that  they  are  most  likely  to  provide 
all  the  information  and  only  the  information  actually  needed. 

•  Include  as  few  personal  identifiers  as  feasible. 

•  Analyze  and  publish  data  in  a  responsible  and  timely  fashion. 

•  Be  prepared  to  stand  behind  the  results  (and  hope  your  agency  will  stand 
behind you).

•  Be  prepared  to  place  each  data  set  in  the  public  domain  as  soon  as  the  first 
results  are  published. 

•  If  the  findings  are  revolutionary,  be  prepared  for  a  hostile  reaction 
rather than a medal.

•  Finally,  remember  that  the  individual  has  rights  (to  privacy,  to  access 
information,  to  participate  or  not  to  participate  in  surveillance  programs, 
and the like). The public health practitioner, at least in the role of
public  health  practitioner,  has  no  rights--only  responsibilities. 

Public health surveillance constitutes one of the bridges between what we think is happening and
what  is  actually  happening.  As  such,  it  is  one  of  the  most  valuable  tools  of  the  public 
health  practitioner.  With  surveillance  data  as  the  light  bulb  and  the  law  as  a  rheostat 
that  stimulates  change  and  regulates  behavior,  the  two  areas  can  work  in  concert  to 
improve the quality of the public's health.

Chapter XI


Computerizing Public Health Surveillance Systems


Andrew G. Dean

Robert  F.  Fagan 

Barbara Panter-Connah


"We only conquer what we wholly assimilate."


André Gide


In this chapter on informatics or computerization of surveillance systems, we will
first explore what is technically possible in computerization of surveillance, finding
an enormous gap between this and the best of today's actual systems. The barriers to
optimal use of computers in surveillance--mostly social, organizational, and legal--
are then explored. The remainder of the chapter explores some of the problems that must be
confronted in thinking about microcomputer-based surveillance, leaning heavily on
examples from the notifiable disease system in the United States.

OVERVIEW OF A SURVEILLANCE SYSTEM IN THE FUTURE

An Ideal Surveillance System

Ideally  the  epidemiologist  of  the  future  will  have  a  computer  and  communications 
system  capable  of  providing  management  information  on  all  these  phases  and  also 
capable  of  being  connected  to  individual  households  and  medical  facilities  to  obtain 
additional  information. 

Suppose  that  the  epidemiologist  of  the  future  has  a  computer  with  automatic  input  from 
all  inpatient  and  outpatient  medical  facilities,  with  standard  records  for  each  office 
or  clinic  visit  and  each  hospital  admission.   S/he  chooses  to  compare  today  or  this 
week  with  a  desired  period,  perhaps  the  past  5  years,  and  the  computer  displays  or 
prints  a  series  of  maps  for  all  conditions  with  unusual  patterns.  One  of  the  maps 
seems  interesting,  and  the  epidemiologist  may  point  to  a  particular  area  and  request 
more  information.   A  more  detailed  map  of  the  area  appears,  showing  the  data  sources 
that  might  provide  the  desired  information,  with  estimates  of  the  cost  of  obtaining 
the  items  desired.   A  few  clicks  of  the  mouse  button  select  the  sources,  types  of 
data,  and  format  for  a  display,  and  the  computer  spends  a  few  minutes  interacting  with 
computers in the medical facilities involved--extracting information and paying the
necessary  charges  from  the  epidemiology  division's  budget.  Soon  the  more  detailed 
information  is  displayed  on  the  epidemiologist's  computer  screen. 
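
The comparison of "today or this week" with a historical period can be sketched in a few lines of Python. The rule used here (flag a count that exceeds the historical mean by more than two standard deviations) is an assumption made for illustration, as are the function name `flag_unusual` and the sample counts; the scenario does not say how "unusual patterns" would be detected.

```python
from statistics import mean, stdev


def flag_unusual(current_count, baseline_counts, threshold=2.0):
    """Return True when the current count exceeds mean + threshold * SD of the baseline.

    baseline_counts: counts for the comparable period in earlier years
    (hypothetical data; the two-SD rule is an illustrative assumption).
    """
    if len(baseline_counts) < 2:
        raise ValueError("need at least two baseline periods")
    mu = mean(baseline_counts)
    sd = stdev(baseline_counts)  # sample standard deviation
    return current_count > mu + threshold * sd


# Hypothetical asthma visit counts for the same week in each of the past 5 years
baseline = [12, 15, 11, 14, 13]
print(flag_unusual(31, baseline))  # True: 31 is far above the historical range
print(flag_unusual(14, baseline))  # False: 14 is within normal variation
```

A real system would apply such a test to every condition and area, then map only those that are flagged.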

The  pattern  of  hospitalizations  and  outpatient  visits  for  asthma  stands  out,  and  the 
epidemiologist  requests  a  random  sample  of  specified  size  of  persons  who  have  ever  had 
asthma  in  the  same  area,  matched  by  age  and  gender,  to  serve  as  controls  for  a  case- 
control  study.  The  video-cable  addresses  of  these  "controls"  and  of  the  case-patients 
are  quickly  produced  through  queries  to  appropriate  local  medical-information  sources. 
The  epidemiologist  formulates  several  questions  about  recent  experiences,  types  of  air 
conditioning, visits to various public facilities, and the like, adapts these to a
previously tested video questionnaire format, and requests that video interviews be
performed for case-patients and controls. Each household is contacted or left a FAX-like
request to tune to a particular channel and answer a 5-minute query from the
state  health  department  on  a  matter  of  importance  to  public  health.  Eighty-five 
percent  of  the  subjects  respond  to  the  first  query,  and  the  computer  automatically 
follows  up  with  the  rest,  bringing  the  response  to  92%,  with  half  of  the  remainder 
reported  to  be  absent  from  their  homes  for  at  least  2  days. 
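
The request for a random sample of controls "matched by age and gender" might look something like the following sketch. The record layout (dictionaries with `id`, `age`, and `gender`), the 10-year age bands, and the function names are all assumptions for illustration; the scenario does not specify how matching would be done.

```python
import random
from collections import defaultdict


def age_band(age, width=10):
    """Group ages into bands (0-9, 10-19, ...); the band width is an assumption."""
    return age // width


def sample_matched_controls(cases, pool, per_case=1, seed=None):
    """Randomly pick controls from pool that share each case's age band and gender.

    Controls are drawn without replacement, so no person is used twice.
    Returns a dict mapping case id -> list of selected control records.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in pool:
        strata[(age_band(person["age"]), person["gender"])].append(person)
    for members in strata.values():
        rng.shuffle(members)  # random order within each stratum

    matches = {}
    for case in cases:
        available = strata[(age_band(case["age"]), case["gender"])]
        take = min(per_case, len(available))
        matches[case["id"]] = [available.pop() for _ in range(take)]
    return matches


# Hypothetical case-patients and candidate controls
cases = [{"id": "c1", "age": 34, "gender": "F"}, {"id": "c2", "age": 61, "gender": "M"}]
pool = [
    {"id": "p1", "age": 31, "gender": "F"},
    {"id": "p2", "age": 38, "gender": "F"},
    {"id": "p3", "age": 35, "gender": "F"},
    {"id": "p4", "age": 66, "gender": "M"},
    {"id": "p5", "age": 45, "gender": "F"},
]
matched = sample_matched_controls(cases, pool, per_case=2, seed=1)
print({case_id: len(controls) for case_id, controls in matched.items()})
# {'c1': 2, 'c2': 1}  (only one eligible control exists for c2)
```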

The  odds  ratio  for  persons  with  recent  hospitalizations  for  asthma  who  work  in  or 
visit  in  a  particular  neighborhood  is  considerably  higher  than  1.0,  and  the 
epidemiologist  connects  by  local-area  network  to  the  state  occupational  surveillance 
system  and  requests  a  display  of  all  factories  in  the  relevant  area.  Selecting  those 
that  deal  with  possibly  allergenic  materials,  s/he  issues  a  request  for  more  detailed 
investigation of activities at the plants in a selected time interval. The
epidemiologist  also  requests  information  from  the  weather  bureau  on  wind  direction  and 
velocity,  temperature,  and  rainfall. 
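
As a reminder of what the odds ratio in this scenario measures, the calculation from a 2x2 table of exposure (works in or visits the neighborhood) by case status can be sketched as follows. The counts are invented for illustration and the function name is an assumption; the scenario gives no figures.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio from a 2x2 table: (a * d) / (b * c).

    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls.
    """
    if unexposed_cases == 0 or exposed_controls == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Hypothetical counts: 30 of 40 case-patients and 20 of 60 controls were exposed
print(odds_ratio(30, 10, 20, 40))  # 6.0
```

A value well above 1.0, as in this made-up table, is what would prompt the epidemiologist to look more closely at the neighborhood.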

Within  a  few  hours,  a  plant  is  identified  that  is  in  the  process  of  moving  a  large 
pile  of  by-products  with  a  bulldozer.  A  request  is  issued  that  the  by-product  be 
sprayed  with  water  to  prevent  its  particles  from  becoming  airborne,  and  the  plant 
manager  readily  agrees  when  shown  the  maps  that  depict  hospitalization  rates  for 
asthma  downwind  from  the  plant.  To  monitor  progress  and  widen  the  investigation,  the 
epidemiologist  asks  the  computer  to  do  similar  studies  for  conjunctivitis  and  for 
coryza  or  hay  fever  over  the  previous  and  next  2  weeks.  Selecting  several  maps  and 
tables  to  include  in  the  report,  s/he  asks  the  computer  to  write  a  description  of  the 
studies  performed  and  the  findings,  and  then  dictates  a  brief  summary  of  the  problem 
and  several  follow-up  notes  to  the  voice  port  of  the  computer.  At  the  end  of  2  weeks, 
the  number  of  cases  of  asthma  has  fallen  to  normal  for  the  area,  and  the  computer 
calculates  on  the  basis  of  the  number  of  medical  visits  during  the  outbreak  that 
$55,000  has  been  saved  at  a  total  cost  of  a  few  hours  of  the  epidemiologist's  effort, 
a  site  visit  to  the  plant,  and  charges  of  $9,500  for  the  data  and  the  communication 
facilities  used  to  perform  the  interviews. 

Barriers to the Ideal Surveillance System

Obviously, we are a long way from implementing the system described above. It may be
helpful  in  thinking  about  the  future  to  explore  what  barriers  must  be  surmounted 
before  this  scenario  can  be  enacted.  Strangely  enough,  few  of  them  are  technical;  all 
of  the  necessary  systems  could  be  built  today  with  fairly  conventional  equipment  and 
software,  with  the  exception  of  the  two-way  interactive  video  connection  with  each 
household.   This  hook-up  with  the  individual  household  is  more  likely  to  be  available 
within  the  next  10  years  than  is  the  connection  between  the  physician's  record  files 
and  the  health  department.   In  fact,  the  two-way  interactive  video  link  between  the 
household  and  the  outside  world  is  simply  awaiting  the  government's  or  the 
marketplace's  decision  on  what  format  will  be  used  and  on  the  realization  of  the 
benefits  of  such  a  connection  on  the  part  of  the  entrepreneurs  and  the  public. 

However, there are some difficult problems to be solved before the "ideal system" can
be  implemented.   They  include  the  following: 

a)  The rapid availability of standardized, computerized medical
records. Several issues need to be addressed before such a system is
possible. In the United States, for example, a profusion of computerized
medical-record systems for inpatient and outpatient records, as well as for
insurance and other purposes, has been developed. These systems contain a
plethora of different variables and use many different formats. Until a
simple core public health record of age, gender, geographic location,
diagnosis, and a few other items is created for each outpatient visit and
each hospitalization--and is available in a standard format without
delay--the responsive interactive system above remains an unrealistic
pipe dream. An additional problem is that most medical records are still
no more than partially computerized.

The  barriers  to  establishing  standardized  public  health  output  from 
computerized  medical  records  are  primarily  political  and  administrative; 
most  large  retail  organizations  create  records  of  similar  size  for  each 
item sold, and the items carry, on average, a much lower price than the
cost  of  a  visit  for  medical  care.   Once  there  is  the  will  to  establish  a 
national  computerized  medical  record  system,  the  technical  hurdles  will 
be readily overcome. The needs include standard but suitably flexible
record formats, solutions to problems associated with confidentiality,
incentives to create the records (including the assurance of appropriate
and cost-effective use of the records), and voice output.

b)  Another  problem  is  the  lack  of  recognition  that  information  about 
patients, except for legally designated "reportable diseases," is useful
in  public  health  and  should  be  available  to  public  health  agencies.  The 
level  of  awareness  could  be  heightened  if  technical  solutions  to  problems 
of  confidentiality  were  publicized  and  understood  by  the  public  and  their 
legislative representatives. One-way encoding algorithms, if properly
used, could provide partial solutions to matching and follow-up
problems without turning public health agencies into
carbon copies of the dreaded "big brother."

c)  A  pervasive  feeling  among  those  in  charge  of  data  that  their  data  base 
must  be  "clean"  before  anyone  else  can  use  it.  Months  or  even  years  are 
consumed  while  corrections  and  updates  are  made  to  make  the  data  as 
accurate  as  possible.   Although  from  one  perspective  this  quality  control 
is  necessary  and  important,  the  concept  of  "surveillance"  includes  rapid 
turnaround,  a  realization  on  the  part  of  everyone  concerned  (even  the 
media  and  the  public)  that  the  data  are  preliminary,  and  the 
understanding  that  in  order  to  look  at  today's  data  today,  one  must  be 
willing  to  accept  today's  imperfections.  This  mental  shift,  as  well  as 
corresponding  technical  developments,  will  be  necessary  before  a 
computerized  system  can  be  used  to  examine  automatically  a  "time  slice" 
of  disease  and  injury  records  that  originate  in  clinics  and  hospitals. 
Imperfections  will  be  everywhere,  and  methods  must  be  found  to  cope  with 
reality--even  if  it  includes  warts--on  an  immediate  basis. 
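
The one-way encoding mentioned in item (b) can be illustrated with a short Python sketch. It uses a keyed hash (HMAC with SHA-256), which is a present-day stand-in chosen for illustration and not an algorithm named in the text; the function name, the choice of name plus birth date as the identifier, and the key are likewise assumptions.

```python
import hashlib
import hmac


def encode_identifier(name, birth_date, secret_key):
    """Return a one-way code for a person; the name cannot be recovered from it.

    The same person always yields the same code, so records from different
    sources can be matched without the agency storing the name itself.
    """
    # Normalize so trivial differences in spacing or capitalization still match
    normalized = f"{name.strip().lower()}|{birth_date}".encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()


key = b"hypothetical-key-held-by-the-reporting-system"
clinic_code = encode_identifier("Jane Doe", "1950-03-14", key)
hospital_code = encode_identifier("  jane doe ", "1950-03-14", key)
print(clinic_code == hospital_code)  # True: the two records can be linked
```

Matching of this kind is only partial, as the text notes: a misspelled name or a wrong birth date produces a different code, and whoever holds the key could test guesses against the codes.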

The  Technology  of  the  Future 

As  stated  above,  today's  technology,  given  enough  social  and  organizational 
development,  is  adequate  to  allow  the  creation  of  miracles  in  public  health 
information  and  communication.  Nevertheless,  it  seems  likely  that  development  in 
technology will continue to be more of a driving force in public health computing
than progress in political and social organization.

Technologic  developments  over  the  next  decade  will  probably  include  the  areas  shown 
below: 

High  capacity  storage  devices 

CD ROMs (compact disk read-only memory) similar to those used for music make it
possible to have access to large bibliographic data bases anywhere there is
electricity. The MEDLARS data base of the U.S. National Library of Medicine can be
searched from a clinic in Africa; once there are lower prices for books on CD ROM (and
they include needed illustrations), it will be possible to take a medical library
anywhere in a briefcase. Past data bases from the United States and elsewhere will
become  available  on  CD  ROM,  although  the  process  of  cleaning  them  up  for  this  purpose 
often  reveals  gaps  and  inconsistencies  that  reflect  changing  definitions  and  diminish 
their  value  as  consistent  anchors  for  comparison. 

Networks 

A local area network (LAN) is a system linking microcomputers, terminals, and workstations
with each other and/or a mainframe computer to facilitate sharing of equipment (e.g.,
printers), programs, data, or other information. LANs are transforming the way many
agencies do business. The most noticeable effect is the transmission of written
memoranda that could not or would not have been typed, packaged, and sent through a paper
system.  The  cost  of  installing  and  supporting  a  LAN  is  not  small,  particularly  in 
terms  of  support  personnel.  Uses  for  surveillance  include  entering  data  at  multiple 
computers connected by a LAN. This requires special software to protect against
errors if several people enter data in the same file at the same time. Special
precautions to protect confidentiality are also necessary in a network.
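
The kind of protection needed when several people enter data into the same file at once can be suggested by a small Python sketch. Here threads within one program stand in for data-entry stations on a LAN, and an in-memory lock stands in for the file or record locking that network software would provide; the class and field names are hypothetical.

```python
import threading


class CaseRegister:
    """A shared case file that several data-entry stations write to."""

    def __init__(self):
        self._lock = threading.Lock()
        self._records = []
        self._next_id = 1

    def add_case(self, condition, county):
        # Only one station at a time may assign an ID and append a record,
        # so no ID is issued twice and no record is lost.
        with self._lock:
            record_id = self._next_id
            self._next_id += 1
            self._records.append({"id": record_id, "condition": condition, "county": county})
            return record_id

    def count(self):
        with self._lock:
            return len(self._records)


def enter_cases(register, n):
    """Simulate one station entering n case reports."""
    for _ in range(n):
        register.add_case("asthma", "Example County")


register = CaseRegister()
stations = [threading.Thread(target=enter_cases, args=(register, 500)) for _ in range(4)]
for s in stations:
    s.start()
for s in stations:
    s.join()
print(register.count())  # 2000: every record from all four stations was kept
```

On an actual network the lock would have to be enforced by the shared file system or data base rather than inside one program.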

New  user  interfaces 

The  parts  of  programs  that  interact  with  users  have  become  easier  to  understand,  and 
more  attractive,  with  pull-down  menus,  windows,  and  pointing  devices  such  as  the 
"mouse." This elegance has its cost in terms of requirements for faster computers,
for  more  memory,  and  particularly  for  greater  skill  to  produce  such  programs.  Some 
new programs cause unexpected problems when run with older programs or on older
computers. All in all, the trend is toward a standard set of screen "controls," like
those  in  modern  cars,  but  the  path  in  that  direction  is  replete  with  experiment  and 
minor  failures. 

New  programming  tools 

It  is  widely  recognized  that  software  production  is  the  narrow  point  in  the 
implementation  of  new  ideas  in  computing.  Useful  software  still  requires  hundreds  of 
thousands  of  lines  of  hand-written  and  highly  personal  "coding."  Many  new  trends  such 
as "fourth-generation data bases," computer-aided software engineering (CASE) tools, and
"object-oriented design" have made programming more productive, but this area of new
tools  is  one  in  which  major  advances  would  create  revolutionary  changes. 

Higher-capacity  processors  and  more  memory 

The  almost  miraculous  advances  in  computer  speed  and  memory  capacity  in  the  last 
decade  have  removed  many  of  the  limits  that  required  use  of  mainframe  computers  or 
minicomputers  rather  than  microcomputers.   Now  almost  any  project  can  be  done  on  a 
microcomputer  or  several  microcomputers  connected  by  a  LAN  if  there  is  sufficient 
motivation. 

Video and computer integration

Photographs  and  fully  functional  video  will  soon  be  appearing  on  our  computer  screens. 
Although this may have greatest impact in pathology, radiology, and education, it
also offers opportunities to use color and three-dimensional dynamic displays for
epidemiologic  data.   The  possibilities  for  computer  interaction  via  ordinary 
television  sets  are  exciting,  because  every  epidemiologist  (and  market  researcher)  can 
savor  the  possibility  of  interviewing  citizens  via  cable  television  with  the  results 
captured  immediately  in  computerized  form.  The  medium  offers  new  challenges  in 
identifying  responses  that  result  from  the  various  stages  of  humor,  exasperation,  or 
intoxication  that  citizens  may  undergo  in  the  privacy  of  their  homes. 

Voice  and  pen  input 

Systems are available now that identify thousands of spoken words (for tens of
thousands  of  dollars)  and  allow  for  a  crude  interaction  between  voice  and  computer. 
Computers  that  recognize  handwritten  text  of  reasonably  structured  type  are  being  sold 
currently. Presumably the rather elementary state of computerization of medical
records  will  undergo  a  quantum  leap  once  such  systems  allow  medical  staff  to  dictate 
to  the  computer  without  typing  and  preferably  without  being  near  a  computer.  When 
medical  handwriting  is  replaced  by  voice  dictation  into  a  lapel  microphone,  real 
progress  may  occur  in  the  use  of  computers  in  both  clinical  medicine  and  public  health 
settings.   As  stated  above,  however,  realizing  real  public-health  benefit  from  such 
technology  will  require  dramatic  social  and  legal  changes. 

BACK  TO  THE  PRESENT:  COMPUTERIZED  PUBLIC  HEALTH 
SURVEILLANCE  IN  1992 

Since  1985,  Centers  for  Disease  Control  (CDC)  staff  have  installed  and  maintained 
customized  disease-surveillance  software  in  36  state  health  departments  and  a  number 
of  county,  district,  and  territorial  departments.  The  software  has  been  based  on  Epi 
Info, a public-domain word-processing, database, and statistics package for
IBM-compatible microcomputers that is a joint product of CDC and the Global Programme on
AIDS, World Health Organization (1,2). These systems have made possible the
participation of all 50 states in the National Electronic Telecommunications
Surveillance System (3,4). Benefits cited in a recent evaluation include improved
access to data and improvement in both quality of data and access associated with
decentralized entry of data (5).

Although  reportable-disease  systems  are  a  specific  kind  of  surveillance  system  and  Epi 
Info   is  only  one  type  of  data-base/statistics  program  around  which  a  system  can  be 
built,  many  of  the  principles  of  computerization  apply  to  other  systems.  To  avoid 
empty  generalization,  much  of  the  rest  of  this  chapter  is  based  on  CDC's  experience 
with  reportable-disease  surveillance  using  Epi  Info.      The  information  is  directed  to 
those  considering  computerization  of  a  disease-surveillance  or  similar  system  of 
records,  whether  they  wish  to  do  their  own  system  design  or  will  be  working  with  a 
professional  computer-systems  designer.  Computerizing  a  surveillance  system  for 
disease  is  not  easy.  Since  the  success  of  computerization  depends  as  much  on  the 
administrative  and  epidemiologic  environment  as  on  the  software,  it  is  vital  that 
public  health  practitioners  understand  the  details  of  a  new  system  and  participate  in 
its  design.   The  most  important  step  in  developing  a  computerized  surveillance  system 
is identifying the public health objective for the system. In some cases, the
objective(s) will have been clear for decades in a manual system ("Identify and treat
or isolate cases of X and evaluate results," or "Assess results of immunization
programs and identify new cases for special control efforts"). Computerization can
then  be  directed  toward  accomplishing  the  same  task  more  efficiently  or  in  greater 
volume  or  detail. 

The  most  successful  computer  systems,  however,  are  those  that  change  methods  by  which 
an agency operates rather than those that merely automate a manual task (6). In
establishing a new surveillance system or reexamining an existing system, it may be
useful to address the following question: "What key pieces of information do I want
to see on my desk (or computer screen) every day, week, month, or year that will make
my work easier or more effective?" The same question can be asked at several levels
of management--from epidemiologic technician to epidemiologist to director of a public
health agency.

Given  a  surveillance  system  that  has  a  public  health  goal  and  to  some  extent  achieves 
the goal, why computerize? Sometimes the answer is obvious--"because the annual report
takes a herd of clerks 2 years to process," or "we like the graphs health department A
turns out so easily with their computer." Potential benefits relate to quality of
data  or  of  reports,  quantity  of  data  that  can  be  processed,  and  speed  of  processing. 
Dissemination  (copying)  of  surveillance  records  to  another  site  is  one  reason  disease 
reports  in  all  50  U.S.  states  are  computerized. 

We  were  unable  to  find  systematic  studies  on  the  benefits  of  computerizing  public 
health  surveillance  systems,  although  numerous  articles  describe  individual  systems 
that have been computerized (7-10), and Gaynes et al. (21) describe methods for
evaluating  a  computerized  surveillance  system.  In  literature  about  the  commercial 
world,  benefits  of  computerization  have  been  examined  from  the  viewpoint  of  financial 
savings.   Savings  by  automating  a  manual  information  process  may  amount  to  20%  or  so, 
but  the  real  benefits  are  achieved  if  computerization  transforms  the  entire  process 
concerned, giving a competitive advantage in the commercial world--which would
correspond to a new order of service in the public health world (6). So far, most
public  health  applications  have  automated  manual  systems,  although  some--such  as  the 
spreadsheet  calculation  of  the  impact  of  smoking  on  populations--verge  on  establishing 
new and previously unknown styles of doing business (12).

One problem is the same as that cited in other "vertical markets" (industries with
specialized practitioners), such as the construction, meat-packing, and real estate
industries: the market is small. With only 7,000 epidemiologists in the United States,
relatively few commercial developers feel that it is financially worthwhile to develop
software for this market alone, since applications such as spreadsheets, languages, and
word processors may sell millions of copies to the general public (13).

Basic  Needs 

The  first  requisite  for  computerization  is  a  paper  system  or  operational  design  that 
works  reasonably  well  or  would  do  so  if  the  process  were  speedier  and  more  accurate. 
Chaos  computerized  is  not  necessarily  an  improvement  over  what  is  already  in  place, 
although  the  process  of  computerization  offers  a  chance  to  rethink  some  of  the 
features  of  a  system  and  to  make  improvements.   If  the  surveillance  system  is  a  new 
one,  it  may  be  desirable  to  evolve  the  computer  facilities  in  small  stages  with 
minimal  investment  until  the  system  proves  to  be  useful  and  well-conceived.   This 
requires  a  careful  plan  (including  provision  for  changing  the  plan  if  necessary)  but 
will  minimize  the  expense  of  adaptation  as  the  epidemiologic  design  of  the  system 
undergoes  the  inevitable  adaptation  to  external  reality.  After  the  "bare  bones" 
system has proven its worth and the probability of expensive changes is lower, the
"bells and whistles" can be added.

Personnel  to  do  the  collection  of  data,  data  entry,  analysis,  and  system  maintenance 
are  important  contributors  to  the  system.   Many  of  the  tasks  can  be  learned  by  current 
employees,  particularly  if  they  find  this  challenge  welcome.  If  possible,  those 
chosen  should  be  long-term  employees  to  assure  stability  of  the  system,  although  they 
may  be  aided  by  students  and  other  temporary  employees.   The  epidemiologist  who  will 
use  the  results  should  participate  in  the  planning  of  the  system  and  should  understand 
how  it  is  constructed.   A  staff  member  with  some  programming  skills  and/or  aptitude 
for  microcomputing  should  be  involved  in  designing  and  setting  up  the  system,  even  if 
an  outside  consultant  does  the  actual  programming. 

If  several  computers  are  to  interact  and  share  data,  a  set  of  standards  is  necessary 
(just as humans carrying on a conversation need a common language). In the
United  States,  the  states  and  CDC  chose  a  standard  record  format  so  that  computers  of 
different  types  could  reformat  data  to  a  set  of  standard  records  and  send  these  to  the 
central  agency.   This  standard,  first  devised  in  1984  and  revised  in  1991,  has  served 
the  purpose  well,  without  placing  unnecessary  restrictions  on  the  type  of  hardware  or 
the  format  of  records  kept  within  each  state.  One  state  maintains  20  times  more 
information  for  local  use  than  do  other  states,  but  all  export  the  same  standard 
record  formats  to  the  national  level.  The  new  standard  record  format  allows  for 
standard demographic and diagnostic information, attachment of variable-length
detailed  reports  for  selected  diseases,  mixture  of  summary  with  individual  records, 
and  automatic  comparison  of  state  and  national  data  bases  with  each  transmission. 

Most  government  settings  have  an  organization  in  charge  of  computer  programming, 
approval  of  new  systems,  and  purchasing  of  computers  and  software.   It  is  important  to 
maintain  liaison  with  this  organization  and  to  arrange  its  assistance  ahead  of  time 
with  difficult  areas  such  as  purchasing  computers.   In  some  organizations,  purchases 
are limited to particular types of computers--occasionally with unique
characteristics--or  to  centrally  administered  systems.  We  recently  encountered  a 
network  of  "diskless"  workstations  that  presented  numerous  problems  in  trying  to  load 
or  run  software  or  back-up  files  from  a  particular  station  without  a  removable  storage 
device.   If  such  problems  are  present,  it  is  prudent  to  discover  and,  if  possible,  to 
surmount  them  at  an  early  stage  through  patient  negotiation  and  collaboration  or  other 
methods  if  necessary.  The  technical  difficulties  that  arise  in  setting  up  a  computer 
system  are  usually  the  easy  problems;  the  difficulties  that  lead  to  months  and  years 
of  delay  and  unhappiness  usually  reflect  misunderstanding  and  miscommunication  among 
individuals  or  organizational  entities. 

Some Key Concepts: Files, Records, and Fields

Computerized  records  are  stored  in  files.  A  file  is  a  collection  of  records,  usually 
one  record  per  case,  that  has  a  name  (e.g.,  GEPI.REC,  for  General  EPIdemiology)  and 
can  be  manipulated  as  a  unit.  Files,  like  books,  can  be  opened,  closed,  read,  written 
to,  or  discarded.  They  are  stored  on  nonvolatile  media  such  as  hard  or  floppy  disks 
or magnetic tape.

Records correspond to one copy of a completed questionnaire or form, such as a
disease-report card. Usually, one disease report or questionnaire is stored in a file
as a single record. Records can be displayed on the screen, searched for by name or
some other characteristic, saved (written) to a disk, or marked as deleted. Many
records can be stored in each file.

A  field  is  one  item  of  information  within  a  record.   NAME,  AGE,  and  DATEONSET  might  be 
fields  within  a  disease-report  record.   Records  in  a  particular  file  all  have  the  same 
fields.  Each  field  has  a  name,  a  type  (text,  upper-case  text,  numeric,  date,  etc.), 
and  a  length,  such  as  22  characters  for  NAME  or  3  for  AGE.   During  analysis,  fields 
may  be  called  variables,  and  commands  such  as  "TABLES  DISEASE  COUNTY"  are  used  to 
instruct  the  system  to  process  a  particular  file  and  construct  the  desired  table  by 
tabulating  the  fields  or  variables  called  DISEASE  and  COUNTY.   In  this  case,  the 
result  in  Epi  Info   would  be  a  table  that  lists  DISEASE  down  the  left  side  and  COUNTY 
across  the  top,  with  numbers  of  reports  by  county  indicated  in  the  cells  of  the  table. 
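
The idea of records, fields, and a two-way table can be sketched in a general-purpose language. The Python fragment below is only an illustration under stated assumptions: the records and their values are invented, the function names are hypothetical, and the code is not how Epi Info itself is implemented. It counts reports for each combination of DISEASE and COUNTY, which is the kind of result a command such as "TABLES DISEASE COUNTY" is described as producing.

```python
from collections import Counter

# Hypothetical disease-report records: each dict is one record,
# each key is one field. Names and values are illustrative only.
records = [
    {"NAME": "A", "AGE": 34, "DISEASE": "MEASLES", "COUNTY": "ADAMS"},
    {"NAME": "B", "AGE": 5, "DISEASE": "MEASLES", "COUNTY": "BAKER"},
    {"NAME": "C", "AGE": 61, "DISEASE": "HEPATITIS A", "COUNTY": "ADAMS"},
    {"NAME": "D", "AGE": 19, "DISEASE": "MEASLES", "COUNTY": "ADAMS"},
]


def cross_tabulate(rows, row_field, col_field):
    """Count records for every (row_field, col_field) pair of values."""
    return Counter((r[row_field], r[col_field]) for r in rows)


def print_table(counts):
    """Print row values down the left side and column values across the top."""
    row_values = sorted({row for row, _ in counts})
    col_values = sorted({col for _, col in counts})
    print(f"{'':<14}" + "".join(f"{c:>8}" for c in col_values))
    for row in row_values:
        cells = "".join(f"{counts[(row, c)]:>8}" for c in col_values)
        print(f"{row:<14}{cells}")


table = cross_tabulate(records, "DISEASE", "COUNTY")
print_table(table)
```

Because every record in the file has the same fields, the same function can tabulate any pair of fields without change.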

Hardware:  What  Size  Computer  is  Appropriate? 

With  microcomputers  being  available  for  much  less  than  $5000,  it  is  possible  to 
process  more  than  100,000  records  in  reasonable  time  periods.   Processing  time  tends 
to  reflect  the  record  length  as  well  as  the  number  of  records,  however,  and  the  size 
of  each  record  should  be  kept  short  if  large  numbers  will  be  processed.   Since  the 
total  number  of  disease  reports  for  the  United  States  is  several  hundred  thousand  per 
year,  states  and  counties  should  find  it  possible  to  build  most  systems  on  a 
microcomputer  if  desired. 

Minicomputers  and  mainframes  can  serve  as  the  basis  for  surveillance  systems  if 
available  at  reasonable  cost  and  if  programming  and  support  staff  are  available  to 
work  creatively  with  staff  of  the  surveillance  system.   The  greater  technical  skill 
required  to  run  and  program  such  computers  often  resides  in  an  organization  other  than 
the  one  running  the  surveillance  system,  and  close  coordination  becomes  much  more 
important  than  in  the  do-it-yourself  situation  with  a  microcomputer. 

Systems  that  seem  to  require  processing  of  millions  of  records,  such  as  hospital 
discharge  or  Medicare  records  for  a  state,  can  be  reduced  by  sampling  to  a  manageable 
size for the microcomputer. The mainframe can be used to select a sample of records
(e.g.,  particular  age  groups,  diseases,  every  tenth  record,  or  persons  born  in  decade 
years).  Files  are  then  exported  for  processing  on  a  microcomputer  that  is  more 
responsive  to  the  epidemiologist's  wishes.  Epidemiologists  are  usually  acutely 
conscious  of  sample  size  when  performing  interviews  but  sometimes  fail  to  recognize 
how  unnecessary  it  is  to  process  6  million  records  to  estimate  a  simple  proportion. 
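
A small simulation may make the point about sample size concrete. The Python sketch below rests on assumptions: the "discharge file" is simulated rather than real, the 12% prevalence is arbitrary, and the interval uses the simple normal approximation. It takes every tenth record and estimates a proportion with an approximate 95% confidence interval.

```python
import math
import random


def every_kth(records, k, start=0):
    """Systematic sample: keep every k-th record, beginning at index `start`."""
    return records[start::k]


def proportion_with_ci(sample, predicate, z=1.96):
    """Estimate a proportion and an approximate 95% interval (normal approximation)."""
    n = len(sample)
    p = sum(1 for r in sample if predicate(r)) / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Simulated discharge file (an assumption for illustration): 600,000 records,
# of which roughly 12% carry the diagnosis of interest.
rng = random.Random(1992)
full_file = [{"DX": rng.random() < 0.12} for _ in range(600_000)]

sample = every_kth(full_file, 10)
p, low, high = proportion_with_ci(sample, lambda r: r["DX"])
print(f"n={len(sample)}  p={p:.4f}  95% CI=({low:.4f}, {high:.4f})")
```

With 60,000 sampled records the interval is already narrow, so processing the remaining nine-tenths of the file would add little to an estimate of this kind.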

Software 

The  type  of  software  used  to  perform  the  computerization  is  often  less  crucial  than 
the  skills  of  those  who  will  program  and  run  it.  Usually,  there  are  several  types  of 
data-base  or  statistical  packages  that  will  do  a  given  task  well  if  properly 
programmed.  Beware  of  the  'indispensable  programmer'  syndrome,  in  which  a  single 
expert  programmer  writes  a  system  in  his  or  her  favorite  language  and  then  departs  for 
greener  pastures,  leaving  the  users  without  resources  for  further  maintenance. 

Data-base packages such as dBase, Paradox, Foxbase, and Clipper are designed to allow
data input, storage, retrieval, and editing. Most will count records but do not
easily do such statistics as odds ratios. They require a skilled programmer to
produce a customized system.

Statistics  packages,  such  as  Statistical  Analysis  System  (SAS)  and  Statistical  Package 
for  the  Social  Sciences  (SPSS),  focus  on  producing  statistical  reports,  usually  from 
single  files  of  data.   They  are  less  convenient  for  data  entry.   Both  SAS  and  SPSS  now 
have  mainframe  and  microcomputer  versions.  They  contain  many  routines  rarely  used  by 
epidemiologists and occupy large amounts of disk space (tens of megabytes for SAS).

Epi   Info  provides  a  combination  of  data-base  and  statistical  functions,  allowing 
relational  linking  of  several  files  during  data  entry  or  analysis.   Questionnaires  or 
forms  may  be  up  to  500  lines,  with  hundreds  of  numeric  or  text  fields,  and  the  number 
of  records  is  limited  only  by  disk  storage  space.  Frequencies,  cross  tabulations, 
customized  reports,  and  graphs  can  be  produced  through  commands  contained  in  a  program 
file  or  interactively  from  the  keyboard.  Commonly  used  epidemiologic  statistics  are 
part  of  the  statistical  output.  Although  it  takes  little  experience  to  use  Epi   Info 
for  investigating  outbreaks,  producing  a  complete  surveillance  system  from  the 
beginning takes both skill and time. It may, however, be much simpler to modify
software  supplied  with  the  program. 

It  is  important  to  realize  the  limitations  of  software  packages  before  they  are  used. 
Both  statistical  and  data-base  packages  typically  cost  at  least  several  hundred 
dollars  and  therefore  are  not  likely  to  be  feasible  for  classes  of  students  or  large 
numbers  of  remote  computers. 

Some  data-base  packages  limit  the  number  of  fields  in  a  record  or  the  number  of 
records  in  a  file,  and  few  will  do  statistics  without  advanced  programming  or  purchase 
of  a  supplementary  package.   Statistics  packages,  on  the  other  hand,  may  have 
limitations  in  handling  textual  ("alpha")  data,  and  most  allow  processing  of  only  one 
file  at  a  time.  A  complete  surveillance  system  may  require  the  functions  of  both 
data-base  and  statistical  programs. 

The current version of Epi Info, however, has limitations on the number of records that
can be sorted or linked at one time (tens of thousands). Since text fields are also
limited to 80 characters, Epi Info would not be a good choice if large amounts of text
are to be stored, as in a complete clinical system containing dictated notes.

Designing  Entry  Forms 

In a surveillance system, data items are usually entered in a standard format (e.g., a
questionnaire or report form). The information is stored in files containing one
record per individual. In Epi Info, the format of the data-base file is specified by
typing a questionnaire or form in the word processor. The result resembles a paper
form, with entry blanks indicated by special symbols (e.g., underlined characters for
text fields and number signs for numeric fields). The computer reads the form and
constructs  a  file  in  the  proper  format. 
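
The step in which the computer "reads the form" can be illustrated with a short Python sketch. It is a simplified stand-in, not Epi Info's actual parser or questionnaire syntax: the layout of the form, the rule that the field name is taken from the label preceding the blank, and the function name are all assumptions made for the example.

```python
import re

# Illustrative form; layout and naming rules are assumptions,
# not the actual Epi Info questionnaire syntax.
FORM = "\n".join([
    "Name     " + "_" * 22,   # text field, 22 characters
    "Age      " + "#" * 3,    # numeric field, 3 digits
    "County   " + "_" * 12,   # text field, 12 characters
])

# An entry blank is a run of underscores (text) or number signs (numeric)
# at the end of a line, preceded by a label.
BLANK = re.compile(r"(?P<label>[A-Za-z ]+?)\s+(?P<blank>_+|#+)\s*$")


def fields_from_form(form_text):
    """Return one (name, type, length) tuple per entry blank found in the form."""
    fields = []
    for line in form_text.splitlines():
        match = BLANK.match(line)
        if match is None:
            continue  # plain text line with no entry blank
        blank = match.group("blank")
        kind = "numeric" if blank[0] == "#" else "text"
        name = match.group("label").strip().upper().replace(" ", "")
        fields.append((name, kind, len(blank)))
    return fields


for field in fields_from_form(FORM):
    print(field)
```

Each tuple carries the three properties of a field described earlier: a name, a type, and a length.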

In designing a form, it is useful to include a unique case identifier as a number or
combination of letters and digits. This may include meaningful information, such as
the  year,  but  should  not  include  any  item  that  may  need  to  be  changed,  such  as  a 
disease  code.   It  must  be  designed  so  that  a  new  and  unique  number  will  always  be 
available for each record.

The amount of data entry and computer storage required may be minimized by
computerizing  only  information  that  will  actually  be  used.   If  follow-up  information 
such  as  name,  address,  and  telephone  number  can  be  used  from  the  paper  form,  there  may 
be  no  need  to  enter  it  into  the  computer.   If  contact  tracing  is  recorded,  the 
computer  record  may  summarize  the  number  of  contacts  named  and  the  number  found  or 
treated,  with  the  details  on  each  and  progress  of  the  follow-up  efforts  relegated  to 
the  paper  forms  used  by  field  investigators.   When  including  an  item  on  the  input 
form, it is helpful to ask, "how will this be analyzed?" and "how would the result
look after processing?" Computers around the world are full of data items that
someone  entered  "just  in  case  we  need  it."  Most  are  never  needed. 

Textual  material  can  be  printed  from  a  computer  file,  but  it  is  usually  difficult  or 
impossible  to  process  such  entries  as  "Pen,  Strep,  and  Ampicillin,"  to  produce 
meaningful tabulations. For serious analysis a more usable format would be

Penicillin         <Y>
Streptomycin       <Y>
Ampicillin         <Y>

in which "<Y>" represents a blank for a "Y" or "N" response.

A  common  problem  in  designing  entry  forms  is  that  several  data  items  may  be  similar. 
Suppose  you  want  to  record  name  and  treatment  (RX)  status  for  up  to  12  contacts  of 
each  case-patient.  One  possible  approach  is  to  create  fields  called  NAME1  through 
NAME12 and RX1 through RX12. This approach allows the data to be entered, although it
creates a very large data-entry record (say, 12 x 22 characters for NAMEs plus 12 x 1
characters for RX = 276 characters, even if no information about contacts is entered).
However,  analyzing  the  information  becomes  a  programming  nightmare,  as  determining  the 
number  of  contacts  or  their  treatment  status  requires  examining  at  least  12  different 
fields  in  each  record  to  see  whether  they  have  been  filled  in  and  keeping  a  running 
tally  of  the  results.   In  computer  data-base  jargon,  the  record  is  not  "normalized." 
These repeating groups of fields should be placed in separate records--one for each
contact--linked to the main file as described below in the section on linking
special-purpose records. Then a case-patient with one contact has one record in the case file
and  one  record  in  the  contact  file  rather  than  the  equivalent  of  these  plus  11  empty 
records in a single file.

This problem is resolved by rethinking what is really the best unit around which to
build  an  individual  record.  The  simple  answer  is  that  if  you  intend  to  tabulate 
cases,  build  a  case  record;  if  you  will  tabulate  contacts  or  follow-up  visits,  then 
you  need  a  contact  or  follow-up  record.   If  both  are  necessary  and  the  system  is  large 
or  permanent,  records  should  be  placed  in  separate  files  and  linked  using  relational 
data-base  features  as  described  below. 
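
The separate-file design can be sketched as follows. This Python fragment is illustrative only: the field names (CASEID, RX), the identifier format, and the records are hypothetical, and the link is made in memory rather than through any particular data-base package. With one record per contact, counting contacts named and contacts treated no longer requires inspecting 12 pairs of fields in every case record.

```python
from collections import Counter

# Case file: one record per case-patient (hypothetical fields and values).
cases = [
    {"CASEID": "92-0001", "DISEASE": "TB"},
    {"CASEID": "92-0002", "DISEASE": "TB"},
    {"CASEID": "92-0003", "DISEASE": "TB"},  # no contacts named
]

# Contact file: one record per contact, linked to its case by CASEID.
contacts = [
    {"CASEID": "92-0001", "NAME": "CONTACT A", "RX": "Y"},
    {"CASEID": "92-0001", "NAME": "CONTACT B", "RX": "N"},
    {"CASEID": "92-0002", "NAME": "CONTACT C", "RX": "Y"},
]


def contact_summary(case_rows, contact_rows):
    """Return {CASEID: (contacts named, contacts treated)} for every case."""
    named = Counter(c["CASEID"] for c in contact_rows)
    treated = Counter(c["CASEID"] for c in contact_rows if c["RX"] == "Y")
    return {
        case["CASEID"]: (named[case["CASEID"]], treated[case["CASEID"]])
        for case in case_rows
    }


print(contact_summary(cases, contacts))
```

A case with no contacts simply has no records in the contact file, so no empty fields are stored for it.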

Data  Entry 

The  details  of  data  entry  should  be  determined  and  documented,  including  who  will 
prepare  the  paper  records  (if  needed)  for  entry,  who  will  enter  them,  and  at  what 
intervals.  The  status  of  the  report  as  "suspected"  or  "confirmed"  may  determine 
whether  it  is  entered,  and  this  must  be  determined  at  the  outset.   Most  disease 
reports are entered in batches--once a week, for example. In many states not more
than an hour or two is needed to enter the data for a week, although the number of
records varies sixfold among states, and the time required to enter data varies
correspondingly.

Records  linked  to  more  extensive  specialized  forms  can  be  sent  as  partial  submissions 
and  revised  later  to  avoid  delays  in  reporting  caused  by  the  slower  progress  of  data 
collection  for  the  more  detailed  forms.   This  issue  needs  to  be  considered  and 
resolved in advance.

Cleaning  and  Editing  the  Data 

Errors  or  duplications  inevitably  occur  during  data  entry,  and  additional  information 
may  arrive  that  requires  changes  or  additions.  The  data  can  be  "cleaned"  during  data 
entry or with the help of analytic programs that display "outliers," and data can be
checked  visually  by  browsing  through  records  in  the  ENTER  program  or  by  scanning  a 
list  printed  by  the  ENTER  or  ANALYSIS  programs.   Records  can  be  viewed  and  corrected 
in  a  spreadsheet  format  in  ANALYSIS.  Finally,  a  program  called  VALIDATE  can  be  used 
to  compare  files  entered  in  duplicate  by  different  operators.   Records  showing 
different  entries  are  printed  out  for  reconciliation. 
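
The principle of duplicate entry can be shown in a few lines of Python. This sketch is not the VALIDATE program; it is a minimal illustration that assumes each record carries a unique CASEID field (a hypothetical name) and that both operators' files fit in memory. It lists every field on which the two passes disagree, along with records present in only one pass.

```python
def compare_entries(first_pass, second_pass, key="CASEID"):
    """List disagreements between two data-entry passes over the same forms.

    Each item is (key value, field, value in pass 1, value in pass 2).
    A record missing from one pass is reported with field "<record>" and
    two booleans showing whether it is present in pass 1 and in pass 2.
    """
    first = {r[key]: r for r in first_pass}
    second = {r[key]: r for r in second_pass}
    differences = []
    for case_id in sorted(first.keys() | second.keys()):
        a, b = first.get(case_id), second.get(case_id)
        if a is None or b is None:
            differences.append((case_id, "<record>", a is not None, b is not None))
            continue
        for field in sorted(a.keys() | b.keys()):
            if a.get(field) != b.get(field):
                differences.append((case_id, field, a.get(field), b.get(field)))
    return differences


# Hypothetical files keyed in by two operators from the same paper forms.
operator_1 = [
    {"CASEID": "92-0001", "AGE": 34, "COUNTY": "ADAMS"},
    {"CASEID": "92-0002", "AGE": 5, "COUNTY": "BAKER"},
]
operator_2 = [
    {"CASEID": "92-0001", "AGE": 43, "COUNTY": "ADAMS"},  # digits transposed
    {"CASEID": "92-0002", "AGE": 5, "COUNTY": "BAKER"},
    {"CASEID": "92-0003", "AGE": 61, "COUNTY": "ADAMS"},  # entered only once
]

for difference in compare_entries(operator_1, operator_2):
    print(difference)
```

The comparison shows only that the two passes differ; deciding which entry is correct still requires going back to the paper form.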

Epi  Info   allows  extensive  programming  of  error  checks  on  data  entry.   Each  field  can 
be  set  to  accept  only  specified  codes,  and,  if  necessary,  multiple  fields  can  be 
checked for inconsistencies such as gynecologic conditions recorded for males.
Unfortunately,  many  errors  cannot  be  caught  by  such  systems,  and  one  can  still  enter 
the  wrong  code  for  a  less  gender-specific  disease. 
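
Both kinds of check, and their limits, can be sketched in Python. The code lists, the single cross-field rule, and the function name below are assumptions made for illustration and do not reproduce the checking facilities of Epi Info or any other package.

```python
# Hypothetical code lists; a real system would load its own.
LEGAL_VALUES = {
    "SEX": {"M", "F", "U"},
    "DISEASE": {"MEASLES", "HEPATITIS A", "PID"},
}
# Cross-field rule: (disease, sex) pairs that should not occur together.
# PID (pelvic inflammatory disease) recorded for a male is the one
# illustrative entry.
INCONSISTENT = {("PID", "M")}


def check_record(record):
    """Return a list of error messages; an empty list means the record passed."""
    errors = []
    for field, legal in LEGAL_VALUES.items():
        if record.get(field) not in legal:
            errors.append(f"{field}: illegal value {record.get(field)!r}")
    if (record.get("DISEASE"), record.get("SEX")) in INCONSISTENT:
        errors.append("DISEASE/SEX: inconsistent combination")
    return errors


print(check_record({"SEX": "X", "DISEASE": "MEASLES"}))      # illegal code
print(check_record({"SEX": "M", "DISEASE": "PID"}))          # inconsistent pair
print(check_record({"SEX": "M", "DISEASE": "HEPATITIS A"}))  # passes the checks
```

The last record passes even if the patient actually had measles: a wrong but legal code is exactly the kind of error that such checks cannot detect.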

Regardless  of  the  method  used,  errors  should  be  caught  and  corrected  near  the  time  of 
data  entry  if  possible,  since  they  can  create  much  larger  problems  if  left  for  the  end 
of the year. The choice of method depends largely on the orientation and number of
personnel available and perhaps on their preferences after trying different methods.

Analysis  of  Data 

The  type  of  output  desired  should  be  planned  in  advance,  since  the  inputs  and  outputs 
usually  specify  fairly  precisely  what  kind  of  processing  is  needed  to  achieve  the 
result.   Dummy  tables  and  graphs  should  be  sketched  on  paper.  Epi   Info   and  many  other 
data-base  programs  can  be  programmed  to  print  a  table  or  mixture  of  text  and  tables  in 
almost  any  format,  using  a  feature  called  the  "report  generator." 

It  is  not  necessary  to  design  reports  to  cover  all  possible  needs,  since  ad  hoc 
queries  are  an  important  part  of  any  system,  and  additional  reports  can  be  added  later 
if  they  are  deemed  useful.   In  Epi   Info,    an  epidemiologist  can  learn  to  do  simple 
queries  (READ  GEPI;  TABLES  RACE  COUNTY)  in  a  short  time  and  to  limit  these  to 
particular  time  periods  (SELECT  REPORTWK  =  34)  almost  as  easily. 

Sometimes a simple report, such as a listing of this week's reports sorted by disease,
may  be  as  useful  as  a  number  of  tables  with  very  small  numbers  in  each  cell.  The 
number  of  records  available  should  be  considered  in  designing  reports  and  in 
determining  how  often  they  will  be  produced. 

Distributed  Data  Base 

So  far,  we  have  described  a  surveillance  system  housed  in  a  single  microcomputer.   As 
more  community  health  departments  obtain  computers,  however,  the  trend  is  toward 
networks  of  computers  within  a  state,  connected  by  modem  in  ways  analogous  to  those 
used  in  the  National  Electronic  Telecommunications  Surveillance  System  (NETSS),  with 
its  50+  state  and  territorial  participants.   Each  participating  site  enters  data  and 
sends them periodically to a computer at the next level up.

This process would be simple to do if all data were entered at the local level and
sent  to  the  state  level,  and  if  no  changes  were  made  later.   However,  in  practice,  not 
only  are  changes  made,  but  in  some  states  records  are  entered  at  both  state  and  local 
levels,  and  some  method  must  be  in  place  to  see  that  both  levels  of  staff  eventually 
have  the  same  records. 

Ideally,  only  one  copy  of  the  records  would  be  considered  the  "master"  copy,  and  each 
user  would  know  its  location  and  provide  updates  only  at  the  designated  time.   The 
best  way  to  accomplish  this  objective  is  still  being  worked  out,  and  experiments  of 
several  types  are  likely.   Designating  only  one  of  the  sources  as  the  "owner"  and 
rightful  editor  of  the  data  is  one  possibility.  At  present,  we  favor  indicating  on 
each  record  the  site  at  which  it  was  created  and  allowing  only  that  site  to  make 
changes  that  are  transmitted  weekly  to  the  other  sites  to  update  their  copies  of  the 
records.
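
The "originating site" rule can be expressed compactly. The Python sketch below is one possible reading of that rule, not a description of any deployed system: the SITE and CASEID field names, the site labels, and the in-memory dictionary standing in for the data base are all assumptions.

```python
def apply_update(database, update, sending_site):
    """Apply `update` only if `sending_site` created the stored record.

    `database` maps CASEID -> record; each record carries a SITE field naming
    the site that created it. Returns True if the change was accepted.
    """
    stored = database.get(update["CASEID"])
    if stored is None:
        # New record: the sending site becomes its owner.
        database[update["CASEID"]] = {**update, "SITE": sending_site}
        return True
    if stored["SITE"] != sending_site:
        return False  # only the originating site may change the record
    stored.update({k: v for k, v in update.items() if k != "SITE"})
    return True


database = {}
print(apply_update(database, {"CASEID": "92-0001", "AGE": 34}, "COUNTY-A"))  # created
print(apply_update(database, {"CASEID": "92-0001", "AGE": 43}, "STATE"))     # refused
print(apply_update(database, {"CASEID": "92-0001", "AGE": 43}, "COUNTY-A"))  # accepted
```

Under this rule a correction noticed at another level has to be passed back to the originating site, which makes the change and retransmits it.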

State health departments use the latest software to transmit year-to-date summary
information on the state data base to the national level each week. These data are
compared automatically with the contents of the national data base, and any
discrepancies are reported.


Transmitting  Data 


In  NETSS,  most  states  transmit  reports  each  week  through  a  commercial 
telecommunications  network.  The  50+  reports  stay  in  the  network  computer  until  they 
are  picked  up  on  Tuesday  morning  by  CDC  staff,  stripped  of  comments  and  address 
material,  and  joined  together  in  a  single  file  for  processing  on  the  CDC  mainframe. 
Error  checking  is  done  to  test  for  invalid  codes  and  other  problems,  and  error  notices 
are  sent  back  to  the  states. 

Another  method  that  eliminates  errors  caused  by  telephone  noise  involves  transmission 
directly  from  computer  to  computer  by  means  of  modems  and  software  that  retransmits  if 
errors  are  caused  by  noise.   Several  states  are  using  this  method  to  connect  with  CDC 
microcomputers  that,  in  turn,  send  the  files  to  the  CDC  mainframe. 
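
The retransmit-on-error idea can be illustrated with a checksum comparison. The sketch below simulates a noisy line in software and assumes the sender's checksum reaches the receiver intact; it is not the protocol used by any particular modem package, and all names in it are hypothetical.

```python
import zlib


def send_with_retry(payload, channel, max_attempts=5):
    """Send payload over channel, retransmitting until the checksum matches.

    channel is any callable that takes bytes and returns the bytes received
    at the far end (possibly corrupted). Returns (received_bytes, attempts).
    """
    expected = zlib.crc32(payload)
    for attempt in range(1, max_attempts + 1):
        received = channel(payload)
        if zlib.crc32(received) == expected:
            return received, attempt
    raise IOError(f"transmission failed after {max_attempts} attempts")


def make_noisy_channel(bad_attempts):
    """Simulated line that corrupts the first bad_attempts transmissions."""
    state = {"calls": 0}

    def channel(data):
        state["calls"] += 1
        if state["calls"] <= bad_attempts:
            return b"\x00" + data[1:]   # replace the first byte to simulate noise
        return data

    return channel


received, attempts = send_with_retry(b"CASE,000123,43", make_noisy_channel(2))
```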

A third, less elegant but often practical, solution is physical transfer of floppy
diskettes by mail or messenger at intervals. This allows large files to be
transferred  with  minimal  inconvenience,  and  may  be  appropriate  if  the  additional 
trouble  of  setting  up  modems  and  software  is  not  yet  warranted  or  in  developing 
countries  where  telephones  are  unreliable  or  unavailable. 

In  any  case,  the  result  is  that  a  copy  of  a  file  of  records  from  the  peripheral  site 
arrives  at  the  central  site.   The  records  must  then  be  merged  into  the  main  data  base. 
If  all  are  new  records,  this  task  is  straightforward.   If  the  incoming  records  contain 
updates  for  records  previously  transmitted,  the  process  is  more  complex. 

Correcting  and  Updating  Records  from  Another  Site 

In  NETSS,  only  state  participants  are  allowed  to  update  records;  CDC  staff  do  not  do 
so,  although  they  may  enter  temporary  telephone  reports.  Updates  are  sent  as  records 
with  the  same  identification  number  as  that  for  the  original  record.   If  a  new  record 
has  the  same  identification  number  as  a  record  in  the  data  base,  the  existing  record 
is  updated  so  that  all  non-blank  fields  of  the  new  record  prevail.  To  change  an  age, 
for  example,  a  state  would  send  a  record  containing  the  case  identification  number  and 
the  new  age.   To  delete  a  record,  the  state,  year,  and  identification  numbers  are  sent 
in  a  special  'Delete'  record.   When  errors  are  found  at  CDC,  the  information  is 
transmitted to the state staff, who then correct the errors and transmit update
records  the  following  week. 
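
The update and delete rules described above can be sketched as follows. Keying the data base on state, year, and identification number, and marking delete records with a `type` field, are assumptions made for this example rather than the actual NETSS layout.

```python
def merge_incoming(database, incoming):
    """Apply incoming records to database (a dict keyed by case key), in place.

    - A record flagged as a delete removes the matching case.
    - A record whose key already exists updates it: non-blank fields prevail.
    - Any other record is added as a new case.
    """
    for rec in incoming:
        key = (rec["state"], rec["year"], rec["id"])   # hypothetical key
        if rec.get("type") == "DELETE":                # hypothetical flag
            database.pop(key, None)
        elif key in database:
            for field, value in rec.items():
                if value not in ("", None):
                    database[key][field] = value
        else:
            database[key] = dict(rec)
    return database


db = {
    ("GA", 1992, "000123"): {"state": "GA", "year": 1992, "id": "000123",
                             "age": 34, "sex": "F"},
}
updates = [
    {"state": "GA", "year": 1992, "id": "000123", "age": 43, "sex": ""},
    {"state": "GA", "year": 1992, "id": "000200", "age": 7, "sex": "M"},
    {"state": "GA", "year": 1992, "id": "000200", "type": "DELETE"},
]
merge_incoming(db, updates)
```

One consequence of the "non-blank fields prevail" rule is that an update cannot blank out a field that was previously filled in; a correction of that kind would need a delete followed by a new record.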

Individual  and  Summary  Records 

Many  systems  function  with  a  record  for  each  individual  case  report.   In  some, 
however,  there  is  a  need  for  summary  records,  each  of  which  represents  a  number  of 
case  reports.   This  is  helpful  if  large  numbers  of  similar  records  (e.g.,  cases  of 
gonorrhea  in  a  big  city)  are  processed,  or  if  only  summary  numbers  are  available.   It 
also  allows  records  from  entire  years  to  be  summarized  in  condensed  format,  so  that  a 
5-year  trend  can  be  calculated  without  reading  and  processing  each  record  for  the 
previous  5  years. 

A  summary  record  is  similar  to  a  case  record,  but  it  contains  an  additional  field 
called  'COUNT,'  which  contains  a  number.   The  number  indicates  how  many  records  with 
the  same  information  are  represented  by  the  summary  record.  Epi  Info   contains 
commands called SUMTABLES and SUMFREQ to process summary records. These commands sum
the  contents  of  the  count  field  rather  than  counting  individual  records.   Since  a 
record  with  COUNT  equal  to  1  is  an  individual  case  record,  files  that  are  mixtures  of 
summary  and  individual  records  can  be  processed  as  a  single  unit. 
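
The principle behind summing the count field can be shown in Python. This is a sketch of the idea only, not the Epi Info implementation of SUMFREQ; the function name `sum_freq` and the treatment of a missing COUNT as 1 are assumptions.

```python
from collections import Counter


def sum_freq(records, field):
    """Frequency of `field`, summing COUNT instead of counting records.

    A record without a COUNT field is treated as an individual case
    (COUNT = 1), which is an assumption made for this sketch.
    """
    totals = Counter()
    for rec in records:
        totals[rec[field]] += rec.get("COUNT", 1)
    return totals


records = [
    {"disease": "gonorrhea", "COUNT": 250},   # summary record
    {"disease": "gonorrhea", "COUNT": 1},     # individual case record
    {"disease": "syphilis"},                  # individual case, no COUNT field
]
freq = sum_freq(records, "disease")
```

Because individual records simply contribute 1, the same routine handles mixed files without any special cases.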

Linking  Special-Purpose  Records  to  the  Main  Data  Base 

As  mentioned  above,  sometimes  it  is  necessary  to  link  related  records  in  different 
files together in order to allow easy processing of, for example, case-patients and
contacts  who  are  related  to  case-patients.  This  requires  that  a  common  case 
identification  number  be  included  in  each  record.  Epi   Info   and  other  data-base 
programs,  such  as  dBASE,  allow  automatic  linking  of  records  through  such  a  common 
identifier. On data entry, answering "Y" to the question "Contacts (Y/N)?" might
cause  another  form,  representing  the  contact  file,  to  appear  on  the  screen.   The 
operator  can  then  enter  one  or  many  contact  forms  for  this  case,  pressing  a  function 
key  (F10)  to  return  to  the  main  form.   A  separate  record  is  created  for  each  contact. 

In  Epi    Info's   ANALYSIS  program,  the  CONTACT  file  is  READ,  and  the  CASE  file  is  linked 
("related")  to  it.   Each  contact  record  then  contains  information  about  the  case- 
patient  as  well  as  about  the  contact,  and  questions  such  as  "how  many  contacts  of 
female  case-patients  were  treated?"  can  be  answered  easily.   The  CASE  file  can  also  be 
processed  alone  to  answer  questions  such  as  "how  many  cases  of  syphilis  were  there?" 
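
The effect of relating the two files can be imitated in Python as a simple lookup on the common identifier. The sketch below illustrates the linkage idea and is not Epi Info syntax; the field names and sample records are invented.

```python
# CASE file, keyed by the common case identification number (invented data)
cases = {
    "C1": {"sex": "F", "disease": "syphilis"},
    "C2": {"sex": "M", "disease": "syphilis"},
}

# CONTACT file: each record carries the identifier of its case-patient
contacts = [
    {"case_id": "C1", "treated": "Y"},
    {"case_id": "C1", "treated": "N"},
    {"case_id": "C2", "treated": "Y"},
]


def relate(contacts, cases):
    """Attach the linked case-patient's fields to each contact record."""
    related = []
    for contact in contacts:
        case = cases[contact["case_id"]]
        merged = dict(contact)
        merged.update({f"case_{k}": v for k, v in case.items()})
        related.append(merged)
    return related


linked = relate(contacts, cases)

# "How many contacts of female case-patients were treated?"
treated_contacts_of_women = sum(
    1 for r in linked if r["case_sex"] == "F" and r["treated"] == "Y"
)

# "How many cases of syphilis were there?" uses the CASE file alone
syphilis_cases = sum(1 for c in cases.values() if c["disease"] == "syphilis")
```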

We  also  link  disease-specific  forms  to  the  main  data  base  of  reports.   Hepatitis,  for 
example,  requires  a  full  page  of  extra  information  used  to  define  further  the 
epidemiology  of  a  report.  By  linking  a  hepatitis  file  to  the  main  case  file,  records 
are  created  only  if  the  disease  is  hepatitis,  thus  saving  a  great  deal  of  storage 
space over the single-file method, in which all the questions on hepatitis would be
left  blank  in  a  nonhepatitis  record.  Current  systems,  including  the  one  distributed 
as  an  example  on  the  Epi   Info   disks,  contain  related  files  for  hepatitis,  meningitis, 
and  enteric  disease,  each  of  which  only  appears  if  a  relevant  disease  code  is  entered. 

Dissemination  of  Data 

Dissemination  of  results  is  an  important  element  of  the  surveillance  cycle. 
Computerization can assist by making new methods of analysis or presentation
practical. Use of tabular or graphics software in conjunction with desk-top
publishing  technology  can  make  the  preparation  of  results  not  only  faster  but  more 
accurate  and  meaningful.   A  graphic  method  for  comparison  of  current  results  with 
those  for  the  past  5  years  has  been  introduced  to  the  Morbidity  and  Mortality  Weekly 
Report   in  the  United  States  (Figure  V.12)  (14).   This  method  would  have  been  too 
cumbersome  for  manual  processing. 

Computer  software  greatly  simplifies  and  improves  the  production  of  maps  and  graphs. 
Epi Map, a public domain companion to Epi Info to be released in 1993, will make
mapping  available  to  anyone  with  an  IBM-compatible  microcomputer. 

Tables,  maps,  graphs,  text,  and  data  files  may  be  made  available  either  on-line  via 
modem  connections  or  by  distributing  floppy  or  CD-ROM  disks.  The  latter  are 
particularly useful in remote areas or for larger volumes of data than can be easily
sent over low-speed modems.

Data  Disasters 

Destruction of or damage to data on hard disks should be expected and planned for.
During the first 4 years of NETSS (and during the 3-year tenure of its predecessor,
the  Epidemiologic  Surveillance  Project),  a  number  of  hard  disks  have  "crashed."   In 
most  cases,  back-up  files  on  floppy  diskettes  had  been  properly  prepared  and  stored, 
and  they  were  used  to  restore  the  data  once  the  disk  had  been  replaced. 

Recently,  some  state  programs  began  to  reuse  case-identification  numbers  from  several 
years  ago,  not  realizing  that  the  new  records  would  overwrite  the  old  records  in  the 
national  data  base.   It  is  important  to  be  clear  about  the  time  period  for  which 
updates  will  be  accepted. 
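
One way to make the accepted period explicit is a simple check applied before any update is merged. The sketch below is hypothetical; the one-year window is an assumed policy chosen for illustration, not a documented NETSS rule.

```python
def accept_update(record_year, current_year, years_open=1):
    """Accept updates only for the current year and `years_open` prior years.

    The default window of one prior year is an assumption for this example.
    """
    return current_year - years_open <= record_year <= current_year


recent_ok = accept_update(1992, current_year=1993)
old_ok = accept_update(1988, current_year=1993)
```

Records that fail the check would be returned to the sender rather than being allowed to overwrite older cases with the same identification number.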

Upgrading  either  hardware  or  software  is  a  frequent  cause  of  problems,  when  the  new 
items  have  unexpected  features,  occupy  more  memory  space,  or  require  that  protocols 
for  functions,  such  as  communications,  be  changed. 

Computer  viruses  are  an  increasing  cause  of  problems.   They  can  cause  a  variety  of 
difficulties  ranging  from  erratic  behavior  of  software  to  complete  loss  of  files. 
They may be introduced from networks, by accessing other computer bulletin boards, or
by  loading  copied  software  from  unknown  sources. 

Programs  to  detect  and  eradicate  computer  viruses  are  available  commercially.   It  is 
essential  to  install  one  of  these  and  to  be  sure  that  any  disk  from  an  external  source 
is  scanned  for  viruses  before  it  is  copied  or  used  as  a  source  of  new  programs. 

Backup  Methods 

Methods  for  disaster  prevention  center  around  regular  backup  of  data  files  onto  floppy 
diskettes  (or  tape  if  available,  but  beware  of  tape  backups  with  only  one  compatible 
tape drive in the same institution). The backup copies should be rotated so that
several  circulate  in  turn  and  so  that  the  one  overwritten  has  at  least  two  more  recent 
relatives.  To  protect  against  fire,  water  damage,  and  damage  by  panic-stricken 
personnel,  it  is  wise  to  keep  at  least  one  backup  in  a  site  remote  from  the  computer. 
Setting  the  write-protection  feature  on  the  diskettes  after  making  the  backup  is  an 
additional  protection. 
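
The rotation rule can be expressed as a small calculation. The sketch below assumes a strict round-robin among numbered diskettes; the function name and the default of three diskettes are illustrative choices.

```python
def next_backup_slot(backup_number, n_disks=3):
    """Return the index of the diskette to overwrite for this backup.

    Diskettes are used in strict rotation, so the one being overwritten is
    always the oldest and the other n_disks - 1 hold more recent copies.
    """
    if n_disks < 3:
        raise ValueError("need at least 3 diskettes to keep two newer copies")
    return backup_number % n_disks


schedule = [next_backup_slot(week) for week in range(7)]
```

With three diskettes, each backup overwrites the copy made three backups earlier, so two more recent copies always survive once the rotation is under way.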

Upgrading  hardware  or  software  should  be  done  at  a  time  when  use  of  the  system  is 
least  critical,  and  care  should  be  taken  to  allow  for  replacing  the  old  system  exactly 
as  it  was  if  problems  occur  with  the  new  one.   Thus,  before  installing  a  new  version 
of  software,  the  old  one  should  be  thoroughly  backed  up  or  preferably  left  in  place  in 
another  directory  so  that  it  can  be  used  if  necessary. 


Training  of  Staff  and  Transition  Techniques 

We  have  found  that  the  most  effective  staff  training  occurs  by  having  potential 
operators  participate  in  the  design  of  the  system  and  receive  short  demonstrations  and 
hands-on  lessons  at  the  time  the  system  is  installed.   Usually  installation  of  a 
system  takes  two  or  three  days  for  planning  and  decision  making,  two  or  three  days  for 
programming,  and  a  similar  period  for  staff  training,  trial  runs,  and  revisions. 

National  meetings  and  training  sessions  for  operators  of  state  surveillance  systems 
have  been  helpful  in  providing  extra  training  and  motivation  and  in  surfacing  problems 
that  need  to  be  addressed  and  new  ideas  for  software  improvements. 

During  the  transition  from  a  paper  to  a  computerized  system,  both  systems  are  run  in 
parallel for a period until the results are satisfactory and staff feel comfortable
with  the  new  system. 

DISCUSSION 

The  old  image  of  the  computer  expert  in  an  expensive  suit  handing  the  client  the  keys 
to  the  new  "turn-key"  system  perfectly  adapted  to  his  or  her  needs  was  probably  always 
a  fantasy,  but  with  modest  budgets,  small  data  bases,  and  a  desire  for  "hands-on" 
access  to  data,  it  certainly  has  little  relevance  to  public  health  needs.  Although  in 
some  ways  centralized  computers  and  instant  interactivity  for  updating  records  would 
present  fewer  problems  than  the  distributed  systems  we  have  described,  public  health 
workers  usually  do  not  require  and  cannot  financially  afford  the  instant  updates 
needed  for  law  enforcement,  banking,  or  airline  reservations.  Microcomputers  and 
local  data  bases  can  maintain  the  data  and  analytic  results  closer  to  the 
professionals  primarily  responsible  for  prevention  and  control. 

We  are  convinced  that  participation  of  all  50  state  health  departments  in  the  national 
computerized  system  would  have  been  impossible  without  a)  software  for  states  that 
allowed  customization  for  use  of  local  forms  and  procedures,  b)  participation  of  each 
state  epidemiologist's  staff  in  designing  a  system  unique  to  the  state,  and  c)  a 
standardized  record  format.  Each  state  has  a  different  input  form,  although  the 
records  sent  to  CDC  are  restructured  and  variable  values  are  recoded  by  Epi   Info 
programs  so  that  they  are  in  the  uniform  national  format. 

As  systems  become  more  complex,  however,  it  is  important  to  standardize  as  many 
features  as  possible  from  state  to  state  so  that  a  thoroughly  debugged  core  system  can 
be used by all. We are gradually achieving this with a new Epi Info-based system that
has  a  series  of  standard  modules,  accompanied  by  other  modules  that  are  highly 
customizable. 

As  pointed  out  in  this  chapter,  there  is  an  enormous  gap  between  what  is 
technologically  possible  with  the  use  of  computers  in  public  health  and  what  is 
actually  going  on  at  the  grass-roots  level  of  public  health  practice.  Until  the 
keeping  of  medical  records  in  clinical  practice  is  computerized  to  a  much  greater 
extent,  it  would  be  difficult  to  imagine  that  our  scenario  of  the  future  will  actually 
move closer to reality.

Other  key  issues  remaining  to  be  resolved  include  a)  the  balance  between 
confidentiality  and  free  access  to  clinical  records  for  public  health  purposes,  b)  the 
cost  of  data  access  and  of  programming  and  processing,  and  c)  the  ability  of  both 
professionals  and  the  public  to  deal  with  "dirty"  and  preliminary  data. 

Many  of  these  issues  have  both  technical  and  social  solutions.  A  great  deal  of  work 
in  both  realms  remains  to  be  done  before  computerized  public  health  surveillance  can 
be  said  to  have  achieved  its  full  potential. 

Chapter XII


State  and  Local   Issues   in  Surveillance 


Melinda  Wharton 
Richard  L.  Vogt 


"The  government  is  very  keen  on  amassing  statistics.  They  collect  them,  add  them, 
raise  them  to  the  nth  power,  take  the  cube  root  and  prepare  wonderful  diagrams.  But 
you  must  never  forget  that  every  one  of  these  figures  comes  in  the  first  instance  from 
the village watchman, who just puts down what he damn well pleases."

Josiah  Stamp 


INTRODUCTION 

In  a  recent  report,  the  Institute  of  Medicine  defined  assessment  as  a  core  function  of 
public health agencies at the state and local level. "An understanding of the
determinants  of  health  and  the  nature  and  extent  of  community  need  is  a  fundamental 
prerequisite  to  sound  decision-making  about  health.   Accurate  information  serves  the 
interests  both  of  justice  and  the  efficient  use  of  available  resources.   Assessment  is 
therefore  a  core  governmental  obligation  in  public  health."  State  responsibilities 
include "assessment of health needs within the state based on statewide data collection"
as well as "establishment of statewide health objectives, delegating power to
localities and holding them accountable." Responsibilities of local public health
units include "assessment, monitoring, and surveillance of local health problems and
needs and resources for dealing with them" (2).

AUTHORITY  FOR  REPORTING  SURVEILLANCE  DATA 

Although  much  of  this  book  focuses  on  surveillance  at  the  national  level,  the  legal 
and  regulatory  authority  for  public  health  surveillance  activities  in  the  United 
States derives from state and local law (see Chapter X). Both the vital records and
morbidity  reporting  systems  were  developed  initially  at  the  state  level,  and  only 
later  were  national  systems  developed,  with  the  participation  of  all  states  being 
voluntary.   Indeed,  in  the  United  States,  state  and  local  governments  have  both  the 
authority and the responsibility for almost all public health actions. This
decentralization of power is outlined in the Constitution of the United States. Therefore,
although  most  of  the  issues  discussed  in  this  chapter  are  relevant  to  other  countries, 
some  are  unique  to  the  practice  of  surveillance  in  the  United  States. 

Although  the  objectives  of  surveillance  at  the  state  and  local  level  do  not  differ 
substantially from those at the national level, the link to action--whether it be
outbreak  control,  vector-control  activities,  legislation  requiring  use  of  child- 
restraint  devices,  or  community  mobilization--is  most  explicit  at  the  state  and  local 
level.  The  objectives  of  state  as  well  as  national  surveillance  must  be  considered  as 
systems  are  developed  or  redesigned,  to  assure  that  the  information  needed  for  public 
health  action  is  obtained  in  the  most  efficient  and  cost-effective  manner.   The  focus 
of the objectives may vary somewhat by condition (see Chapters I and II).

SOURCES OF SURVEILLANCE DATA

Only two data sources--vital records and notifiable-disease reports--are available at
the  local  level  in  all  states  in  the  United  States.   Although  other  data  sources 
discussed  in  Chapter  III  may  be  available  at  the  state  and  local  levels  in  some  areas, 
alternate  data  sources  may  be  needed  in  some  states  or  localities  to  assess  the  impact 
of  specific  public  health  problems.   Innovative  solutions  to  particular  data-related 
problems  have  been  developed  in  many  communities;  some  issues  related  to  data  sources 
at  the  state  and  local  level  are  summarized  below.   For  more  information  regarding 
other  data  sources,  see  Chapter  III. 

Notifiable  Diseases 

All  50  states  require  that  physicians  report  cases  of  specified  notifiable  diseases  to 
the  appropriate  state  or  local  health  department.  The  legal  authority  for  the 
collection  of  this  information  rests  with  state  statutes  that  are  promulgated  in  state 
regulation;  the  diseases  that  are  reportable  vary  by  state  (2,3).  The  notifiable- 
diseases  reporting  system  was  initially  developed  for  reporting  epidemic  diseases  such 
as  smallpox  and  yellow  fever,  and  this  mechanism  is  still  most  commonly  used  for 
surveillance  of  infectious  diseases.  For  noninfectious  conditions,  reporting  by 
physicians  is  less  uniformly  required.   In  many  states,  however,  reporting  of  specific 
occupational  or  chronic  diseases  is  required  by  statute. 

Sentinel  Systems 

State  and  local  health  departments  may  supplement  information  available  through  the 
notifiable-disease  reporting  system  by  creating  sentinel  reporting  systems.   State- 
based  sentinel  systems  in  Maine  and  Rhode  Island  relied  on  reporting  by  physicians, 
who  were  recruited  by  the  state  health  department  and  were  paid  small  amounts  of  money 
for  participation.  Both  systems  were  subsequently  discontinued  because  of  budgetary 
cutbacks (4,5).

More  recently,  a  sentinel  active  surveillance  system  developed  in  Missouri  has  been 
organized  to  ensure  representation  of  the  six  public  health  districts  in  the  state. 
Over 500 sites were recruited for participation, including schools, hospitals,
day-care centers, preschools, and nursing homes; fewer than 30% of the participating
individuals or institutions were physicians or clinics. Each participating site is
telephoned weekly by local health departments to solicit reports (6). A similar
system, including universities, has been operated by the Los Angeles County Department
of Health Services since 1981. In addition to providing timely information about
reportable diseases, the system also has provided data on a variety of nonreportable
conditions (7).

Such sentinel systems may be particularly useful for following trends in common
conditions--e.g., varicella or influenza--when precise counts of cases are not needed and
when a public health response is not necessary for individual case reports. However,
if the reporting units selected for the sentinel system are unrepresentative of the
overall reporting population, findings may not be generalizable to the wider
population. Sentinel surveillance systems may be used to facilitate collection of
additional risk-factor and other information on a subset of case reports, thus limiting the
overall burden of data collection (8).

Hospital-Based Surveillance

Hospital-based  surveillance  systems,  drawing  on  emergency  room  visits  or  hospital- 
discharge  data,  have  most  commonly  been  developed  at  the  state  and  local  level  for 
surveillance of injuries (9-15). Other uses have included assessment of unmet health
needs by identification of preventable disease (sentinel health events) (16). Aside
from nosocomial infections, such systems are likely to have limited usefulness for
surveillance for communicable disease (17).

In  areas  in  which  hospital-discharge  diagnoses  are  coded  using  external  cause  of 
injury and poisoning codes (E-codes), hospital-discharge data are useful for
surveillance of injuries. Currently 28 states have uniform hospital-discharge reporting
systems, and addition of E-coding is a high priority for state and local
injury-surveillance programs (18). The recent experience of New York State demonstrated the
feasibility  of  such  an  addition,  particularly  when  care  was  taken  to  develop  a 
constituency  to  support  the  proposed  change.  Review  of  clinical  records  demonstrated 
that  93%  of  charts  contained  information  necessary  to  allow  proper  coding.  Since  E- 
coding has begun, 95% of records of injured persons contain a valid E-code (19).

Other  hospital-based  data  sources  may  be  useful  for  surveillance  at  the  state  and 
local  level.   For  example,  trauma  registries  are  a  potential  source  of  data  for  injury 
surveillance (20), despite the lack of representativeness of patients referred to
trauma centers for care (21).

School-Based Surveillance


School-based  surveillance  systems  have  been  developed  in  some  states  to  monitor 
disease trends among children of school age. This approach has been used for
surveillance of influenza and varicella (22,23). Absenteeism is an excellent marker for
influenza  and  is  almost  always  available  for  administrative  reasons.   In  Michigan, 
schools  provide  reports  of  cases  of  notifiable  diseases  among  their  students--along 
with  counts  of  number  of  cases  of  influenza-like  illness  and  varicella--to  local 
health  departments  on  a  weekly  basis.   In  many  states,  notifiable-disease  regulations 
mandate  reporting  of  specified  diseases  by  school  authorities. 

Surveys  at  the  State  and  Local  Level 

Information  on  certain  issues,  such  as  seat-belt  use  or  nonutilization  of  health-care 
services,  cannot  be  obtained  readily  without  the  use  of  surveys.  Although  national 
surveys  may  provide  national  estimates,  data  at  the  state  or  even  local  level  are 
needed  for  health  planning  or  to  support  legislative  initiatives.   Since  1981,  state 
health  departments  have  collaborated  with  the  Centers  for  Disease  Control  (CDC)  to 
conduct  telephone  surveys  of  adults  to  obtain  information  on  health  practices  and 
behavior.   In  1990,  45  states  and  the  District  of  Columbia  participated  in  the 
Behavioral Risk Factor Surveillance System (BRFSS). The BRFSS allows estimation of
age-  and  gender-specific  prevalence  of  various  risk  factors  by  state  (24,25). 
Likewise,  behavioral  risk  factors  among  young  people  are  periodically  measured  through 
state  and  local  school-based  surveys  in  the  Youth  Risk  Behavior  Surveillance  System 
(26). County or community surveys may be particularly useful in areas with small
populations, in instances in which morbidity or mortality data may be of limited
usefulness to monitor the impact of interventions (27).


National  Mortality  Registration  System 

State law requires filing a death certificate for every death that occurs in the
state, and death registration is virtually complete in the United States. At the
state level, mortality data are available before national data are compiled and
released.  Although  the  underlying  cause  of  death  is  determined  using  standard 
computerized  algorithms  in  all  states,  not  all  states  use  E-coding. 

Such  data  are  useful  at  the  local  level  to  identify  preventable  mortality  and  to  set 
health  priorities  in  the  community.  These  efforts  may  be  particularly  important  in 
developing community-based prevention programs for chronic disease (28).

Other  Data  Sources 

Surveillance  responsibilities  of  state  and  local  health  departments  extend  into  many 
other  areas,  and  in  some  jurisdictions  may  include  monitoring  of  environmental 
quality,  illnesses  of  domestic  and  wild  animals,  and  vector  populations.   Although 
outside the scope of this book, these types of surveillance provide important
information at the state and local level. For example, management of persons exposed to
possibly rabid animals is influenced by the epidemiology of rabies in the area of
exposure (29).

Arbovirus  surveillance  includes  monitoring  of  vectors,  vertebrate  hosts,  human  cases, 
weather,  and  other  factors  in  order  to  detect  or  predict  changes  in  the  transmission 
dynamics  of  arboviral  infections.   Guidelines  for  arbovirus  surveillance  programs  in 
the United States have recently been developed (30).

Provider-Based  Reporting:  Special  Issues 

Mandatory  reporting  of  communicable  diseases  by  physicians  has  a  long  history  in  the 
United States, and there is an equally long history of failure on the part of
physicians to comply. During the yellow fever epidemic of 1795, the New York City Health
Committee quarantined patients with yellow fever at Bellevue Hospital. Many
physicians refused to report cases, and the New York Medical Society went on record
opposing the Committee's action, on grounds that the disease was not contagious (31).
Physicians fought early efforts to make tuberculosis reportable, arguing that
compulsory reporting constituted an invasion of the doctor-patient relationship and a
violation of confidentiality (32). By 1913, five states had enacted regulations
requiring reporting of venereal disease. Dr. Herman Biggs, director of the New York
City Board of Health, stated that "the ten year long opposition to the reporting of
tuberculosis will doubtless appear a mild breeze compared with the stormy protest
against the sanitary surveillance of the venereal diseases" (33).

The  completeness  of  reporting  of  communicable  diseases  is  variable,  but  for  most 
diseases in most locations, it is thought to range from low to very low (34,35). Of
course,  factors  other  than  the  failure  of  physicians  to  report  cases  contribute  to  the 
low  level  of  reporting  of  incident  cases.   Persons  with  asymptomatic  infections  or 
mild  disease  are  unlikely  to  seek  medical  care.   Of  those  persons  who  do  seek  care, 
not  all  will  receive  a  specific  diagnosis.   Nationally,  only  5%  of  cases  of  varicella 
are reported in the United States (36), and estimates of completeness of reporting are
similar for shigellosis (37). Studies of outpatient-based or hospital-based reporting
in  some  areas  suggest  somewhat  higher  levels  of  reporting  of  diagnosed  cases  of 
notifiable  diseases,  with  substantial  variation  by  disease  (38-40).      Reporting  rates 
are  higher  for  inpatients  than  outpatients  (17). 

Given  the  historic  reluctance  of  physicians  to  participate  in  reporting  disease,  it  is 
fortunate  that  reports  of  disease  are  available  to  most  state  health  departments  from 
other  sources.  Almost  all  states  mandate  reporting  by  clinical  laboratories  of  at 
least some notifiable diseases (41). Laboratory reporting is often more readily
available  and  reliable  than  reports  from  physicians.  In  Vermont,  71%  of  initial 
reports  of  confirmed  cases  of  notifiable  diseases  in  the  period  1986-1987  originated 
from clinical laboratories; only 10% originated from physicians' offices (42). In
Oklahoma,  approximately  85%  of  cases  of  shigellosis  are  reported,  but  laboratories 
account  for  almost  all  of  the  reports  received.  Laboratories  reported  77%  of  all 
reported cases, compared with only 6% for physicians (43).

Although laboratory-based reporting may be a valuable adjunct to physician-based
reporting, it cannot replace reporting by physicians for all diseases. Some
reportable diseases are clinical syndromes that require clinical judgment and for which
no specific laboratory diagnostic procedures exist (44). In other situations, laboratory
diagnosis may play an important role, but may not be routinely available in a timely enough
manner  to  replace  reporting  by  physicians.  Finally,  physicians  may  have  additional 
information  that  is  epidemiologically  important  but  is  not  known  to  the  laboratory;  a 
timely  report  by  a  physician  may  allow  early  institution  of  control  measures,  without 
waiting  for  the  health  department  to  follow  up  on  laboratory  reports. 

A number of studies have attempted to identify reasons for physicians' failure to
report notifiable diseases (42,45-47). In recent years, physicians have cited many of
the same objections that have been raised historically, as noted above, although it is
at least reassuring that the noncontagiousness of diseases that are actually
communicable is no longer invoked. Commonly cited reasons, in approximate order of
importance, are summarized in Table XII.1.

In  an  effort  to  improve  reporting  of  notifiable  diseases  by  physicians,  local  and 
state  health  departments  have  tried  a  number  of  different  strategies.  Although  many 
of  them  have  not  been  formally  evaluated,  enough  information  is  available  to  reach 
some  conclusions  about  possible  successful  approaches. 

Projects  aimed  at  improving  reporting  by  physicians  have  included  many  interventions 
(e.g.,  revised  reporting  procedures,  improved  dissemination  of  findings  and  feedback 
to  participants,  and  informational  campaigns  regarding  the  importance  of  reporting  and 
outlining procedures for reporting). Even relatively intensive efforts may not
produce  major  increases  in  reporting,  although  they  may  be  effective  in  increasing 
awareness  of  reporting  procedures  among  physicians  (7,48). 

Efforts to increase reporting through specific projects provide some clues on the most
effective approaches. Active surveillance projects, in which health department
personnel contact physicians' offices on a regular basis, have demonstrated 2- to
5-fold increases in the reporting of specified diseases, as well as increases in
reporting of other conditions not subject to active surveillance (49-51). The
consistency of these findings demonstrates that under some circumstances physicians
are willing to report cases of notifiable disease. In these studies, reporting was a
simple matter, and that may be important; equally important may be the message
conveyed by the substantial investment by the health department in active
surveillance--that disease reporting is an important activity.

The need for surveillance data on notifiable disease and the usefulness of such data
are so obvious to workers in state and local health departments that we often believe
that all physicians would report if they only understood the importance of reporting.
Efforts to educate physicians have included a) lectures to medical students, house
officers, and local medical groups on the importance of reporting; b) health
department newsletters; c) educational mailings; and d) efforts made in conjunction
with licensure. Although all of these may be useful, and lectures and newsletters are
important forms of feedback to the medical community, single presentations to clinical
groups, newsletters, and mailings have not been found, in isolation, to increase
reporting. Intensive efforts to market the concept of reporting may be more useful
but will be accompanied by an obvious increase in cost (52).

If sending an occasional speaker to the local medical society and mass mailings are
not effective, what is? The active surveillance projects and other studies of
interventions demonstrate the usefulness of telephone contact (49-51, 53). In fact,
the efforts that work all target individual physicians--rather than groups of
physicians--and make limited use of mailings and more use of personal visits and
telephone contact. Some approaches that appear to be successful include a) providing
physicians with feedback on the health department's disposition of individual cases
(54); b) matching laboratory reports with physicians' reports and, for those cases
reported only by laboratories, notifying physicians that a specific case should have
been reported to the health department; and c) conducting in-person site visits to
review reporting procedures (55). The last intervention may be quite effective in
enhancing laboratory- and hospital-based reporting, especially if accompanied by a
review of medical records. The relevant factor may be less the mode of contact than
the need to remind physicians on a regular basis that there is a health department
that wants the information and that the health department actually does something
with the data that are provided.
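
The matching of laboratory reports against physicians' reports described in approach b) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields (`patient_id`, `disease`, `physician`), the matching key, and the sample records are all hypothetical, and a real health department would need more careful record linkage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    patient_id: str   # hypothetical case identifier shared by both report types
    disease: str
    physician: str    # physician named on the report


def lab_only_cases(lab_reports, physician_reports):
    """Return laboratory reports that have no matching physician report."""
    reported = {(r.patient_id, r.disease) for r in physician_reports}
    return [r for r in lab_reports if (r.patient_id, r.disease) not in reported]


def reminders_by_physician(lab_reports, physician_reports):
    """Group unreported cases by physician so each can be contacted individually."""
    reminders = {}
    for r in lab_only_cases(lab_reports, physician_reports):
        reminders.setdefault(r.physician, []).append((r.patient_id, r.disease))
    return reminders


# Invented sample data for illustration only.
lab = [
    Report("A1", "shigellosis", "Dr. Lee"),
    Report("B2", "hepatitis A", "Dr. Ortiz"),
    Report("C3", "shigellosis", "Dr. Lee"),
]
phys = [Report("A1", "shigellosis", "Dr. Lee")]

print(reminders_by_physician(lab, phys))
```

Grouping the results by physician reflects the point made above that the efforts that work target individual physicians rather than groups.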

Exhortation and pleading for reports are no substitute for a state or local health
department that responds promptly to reported public health problems, provides useful
responses to inquiries from physicians and the public, and gives feedback on its
activities and on the health status of the community to the medical community and the
public. Nonetheless, a few specific steps that state and local health departments can
take to improve reporting of notifiable diseases can be identified (Table XII.2).

Active surveillance works, but it is generally too costly to maintain as a routine
health department activity. Less costly alternatives include sentinel active
surveillance, in which certain physicians and institutions are identified and targeted
for active surveillance. Although this approach has been successful in some areas, it
is also costly and may detract from collection of surveillance data from non-sentinel
sites. Another approach is what has been called "stimulated passive surveillance," in
which the health department uses any contact with the medical community to solicit
reports and provide feedback on community health status and health department
activities. It may not be feasible to contact every physician, or even a systematic
sample of physicians, every week, but every week physicians are contacted for a
variety of purposes, and those contacts can be used to exchange information.

Administrative  barriers  to  reporting  should  be  identified  and  eliminated.   Physicians 
should  be  provided  readable  and  up-to-date  copies  of  lists  of  notifiable  diseases, 
reporting  forms,  and  telephone  and  facsimile  numbers  for  local  and  state  health 
departments.   Reporting  procedures  should  be  as  simple  as  possible.   Some  health 
departments have used toll-free numbers for telephone reporting (46,56). Answering
machines can answer telephones at night, but people can answer questions and
provide--and solicit--additional information. Reporting forms should be simple, clear,
and printed in colors that allow photocopying or transmission by facsimile machine.
Self-addressed, postage-paid cards or envelopes may be helpful. Although these tools may
make  reporting  easier,  without  the  other  components  of  effective  surveillance  they  are 
unlikely  to  have  substantial  impact  on  reporting  behavior  of  physicians. 

State licensing boards may penalize physicians for failing to report, although such
actions are rarely taken. In California, a physician who failed to report a
patient with hepatitis A who subsequently transmitted infection to others had his
license suspended for a year and was placed on probation for 5 years (57). The
medicolegal implications of failure to report are well established in law, where the
physician's obligation has been found to extend beyond the patient under his/her care
(58). Although no single approach--be it improved communications, improved
procedures, education, or fear--is necessarily successful in improving reporting by
physicians, effective presentations have been developed using case studies that
include the medicolegal implications of failure to report (Hendricks K, personal
communication).

MAINTENANCE  OF  A  LIST  OF  NOTIFIABLE  DISEASES 

Although the mechanisms vary, it is important that lists of notifiable diseases
undergo periodic revision. Public health priorities, epidemiology of specific
conditions,  and  available  public  health  interventions  all  change  over  time,  with  the 
result  that  last  year's  list  of  notifiable  diseases  no  longer  meets  this  year's  needs. 
Additions  and  deletions  must  be  made  on  an  as-needed  basis  in  order  to  maintain  the 
usefulness  of  a  notifiable-disease  system.  In  particular,  care  must  be  exercised  to 
assure  that  data  on  all  notifiable  conditions  are  actually  needed  and  are  used  for 
public  health  purposes.   "Diseases  are  often  made  reportable  but  the  information 
gathered  is  put  to  no  practical  use,  and  with  no  feed-back  to  those  who  provided  the 
data.  This  leads  to  deterioration  in  the  general  level  of  reporting,  even  for 
diseases  of  much  importance.  Better  case  reporting  results  when  official  reporting  is 
restricted  to  those  diseases  for  which  control  services  are  provided  or  potential 
control  procedures  are  under  evaluation,  or  epidemiologic  information  is  needed  for  a 
definite  purpose"  (59). 

In  Canada,  specific  criteria  have  been  developed  for  determining  which  diseases  or 
conditions should be reported at the national level (Table XII.3) (60). In practice,
these  criteria  have  not  resulted  in  the  removal  of  any  diseases  from  the  list  of 
nationally  notifiable  diseases,  but  they  have  at  least  provided  a  systematic  basis  for 
deciding  among  diseases  proposed  for  addition. 

ANALYSIS  OF  DATA 

Most  of  the  analytic  issues  relevant  at  the  state  and  local  level  have  been  addressed 
elsewhere  in  this  book  (chapters  V  and  VI),  but  some  problems  encountered  in  analyses 
at  the  state  and  local  level  are  rarely  faced  at  the  national  level. 

Comparison  of  rates  in  different  geographic  areas  poses  particular  and  difficult 
problems  when  the  number  of  events  is  small  and/or  the  population  of  the  areas  is 
small.  When  analyzing  data  drawn  from  a  small  population,  particularly  for  an 
uncommon  event  or  from  a  subset  of  the  population  (e.g.,  when  calculating  age-  or 
race-specific  rates),  calculated  rates  may  be  difficult  to  interpret.   Unfortunately, 
it  is  difficult  to  say  with  certainty  what  population  size,  or  number  of  events,  is 
"too  small"  for  meaningful  analysis. 

Issues involved in assessing the stability of rates and changes in rates when numbers
are small have been well summarized for the nonstatistician (61). For example,
confidence intervals for rates can be calculated as shown in Table XII.4. In general,
rates calculated based on <20 events will have a 95% confidence interval about as
wide as, or wider than, the rate itself.
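
The rule of thumb about fewer than 20 events can be checked with a short calculation. The sketch below is not the formula of Table XII.4, which is not reproduced here; it assumes the common normal approximation for a Poisson count, in which the standard error of a rate is roughly the rate divided by the square root of the number of events. The function name and the example figures are hypothetical.

```python
import math


def rate_ci(events, population, per=100_000, z=1.96):
    """Approximate 95% CI for a rate, treating the event count as Poisson.

    Assumes events > 0; the approximation is crude for very small counts.
    """
    rate = events / population * per
    half_width = z * rate / math.sqrt(events)   # SE(rate) ~ rate / sqrt(events)
    return rate, max(0.0, rate - half_width), rate + half_width


# Invented example: 16 events in a population of 40,000.
rate, low, high = rate_ci(events=16, population=40_000)
print(f"rate {rate:.1f} per 100,000; 95% CI {low:.1f} to {high:.1f}")
print(f"CI width as a multiple of the rate: {(high - low) / rate:.2f}")
```

Under this approximation the width of the interval relative to the rate depends only on the number of events (2 x 1.96 / sqrt(events)), which is why counts below about 20 give intervals roughly as wide as the rate.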

Two methods for comparing independent rates (that is, rates from different,
nonoverlapping geographic areas or from a single area at two different nonoverlapping
time intervals) have been suggested. The 95% confidence interval for the ratio of two
independent rates can be calculated using the formula shown in Table XII.5. The two
rates differ significantly at the 5% level if the 95% confidence interval for the ratio
of the two rates does not include 1. This method produces valid results if the rate
in the denominator is calculated from more than 100 events. The 95% confidence
interval for the difference between two independent rates can be calculated using the
formula shown in Table XII.6. The rates differ significantly at the 5% level if the
95% confidence interval of the difference between the two rates does not include zero.
Sometimes the two methods provide contradictory results; if that occurs, one should
conclude that the rates being compared are not significantly different (61).
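
The decision rule in the preceding paragraph can be sketched as follows. The formulas of Tables XII.5 and XII.6 are not reproduced here, so this sketch substitutes widely used approximations for Poisson counts: a log-scale interval for the ratio and a normal-approximation interval for the difference. Function names and example counts are hypothetical.

```python
import math


def ratio_ci(a, n1, b, n2, z=1.96):
    """Approximate 95% CI for (a/n1) / (b/n2), computed on the log scale."""
    ratio = (a / n1) / (b / n2)
    se_log = math.sqrt(1 / a + 1 / b)
    return ratio * math.exp(-z * se_log), ratio * math.exp(z * se_log)


def difference_ci(a, n1, b, n2, z=1.96):
    """Approximate 95% CI for (a/n1) - (b/n2)."""
    diff = a / n1 - b / n2
    se = math.sqrt(a / n1**2 + b / n2**2)
    return diff - z * se, diff + z * se


def rates_differ(a, n1, b, n2):
    """Call the rates different only when both methods agree.

    If the two methods disagree, the conservative conclusion is that the
    rates are not significantly different.
    """
    r_low, r_high = ratio_ci(a, n1, b, n2)
    d_low, d_high = difference_ci(a, n1, b, n2)
    by_ratio = not (r_low <= 1 <= r_high)
    by_difference = not (d_low <= 0 <= d_high)
    return by_ratio and by_difference


# Invented counts: a events among n1 persons versus b events among n2 persons.
print(rates_differ(30, 50_000, 120, 100_000))
print(rates_differ(50, 50_000, 120, 100_000))
```

In both examples the denominator rate is based on 120 events, in keeping with the condition that the ratio method be used when the denominator rate rests on more than 100 events.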

In another report, four age-adjusted mortality indexes were compared, using 1969-1971
U.S. mortality data by county, for counties with populations of >5,000. On the basis
of coefficients of variation, the standardized mortality ratio produced stable
results for mortality data from all counties studied, while unacceptable instability
was found when the relative mortality index was applied to data from counties with
populations of <50,000. Calculation of years of life lost from all causes produced
stable results when applied to data from counties with populations of ≥25,000 (62).
The stability of rates for specific causes of death remains a problem for small
geographic areas. Methods for stabilization of rates have been developed, specifically
for mapping of uncommon events such as suicide or specific types of cancer by
county (63,64).

As an initial step, before a more complicated method for stabilization of rates is
applied, aggregated rates should be compared with disaggregated rates (i.e., multiple
years versus a single year; state-wide versus county-wide; and entire population
versus age-, gender-, or race-specific rates). High rates in geographic areas with
small populations--or in subsets of the population--may be due to chance, particularly
if the elevated rate is based on a small number of observed cases. Alternatively, if
increases are consistent over time--or across some population subgroups--it is more
likely that they represent important differences rather than chance occurrences.
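
A small numerical illustration may help show why aggregation is a sensible first step. The yearly counts and the population below are invented, and the relative interval width uses the same normal approximation for Poisson counts assumed elsewhere in such back-of-envelope checks; none of this comes from the cited reports.

```python
import math

# Hypothetical yearly case counts for one small county of 8,000 residents.
cases_by_year = {1988: 2, 1989: 5, 1990: 1, 1991: 4}
population = 8_000


def rate_per_100k(events, person_years):
    """Crude rate per 100,000 person-years."""
    return events / person_years * 100_000


def relative_ci_width(events, z=1.96):
    """Width of the approximate 95% CI expressed as a multiple of the rate."""
    return 2 * z / math.sqrt(events)


# Disaggregated: one unstable rate per year.
for year, cases in cases_by_year.items():
    print(year, round(rate_per_100k(cases, population), 1),
          round(relative_ci_width(cases), 2))

# Aggregated: pool the four years into a single, more stable rate.
total = sum(cases_by_year.values())
pooled = rate_per_100k(total, population * len(cases_by_year))
print("pooled", round(pooled, 1), round(relative_ci_width(total), 2))
```

The single-year rates swing widely on a handful of cases, while the pooled rate rests on more events and has a correspondingly narrower relative interval.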

Other  events  deserve  attention,  even  if  only  a  single  case  occurs;  the  occurrence  of  a 
sentinel  health  event  represents  a  failure  somewhere  in  the  system  of  public  health  or 
of health-care delivery and warrants careful attention. Such sentinel events include
maternal and infant deaths and a wide variety of infectious and noninfectious
conditions (65).

Intercensal population estimates for small areas are available from a variety of
sources. Because age-, gender-, and race-specific estimates for small areas from the
U.S. Bureau of the Census are of limited availability, state governments have often
developed their own estimates (66). Methods for interpolating census data for
estimation of small area populations have been developed (67).
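
As a point of reference, the simplest form of interpolation is a straight line between two census counts. The sketch below shows only that baseline; it is not the method of the cited report (67), whose details are not given here, and the function name and figures are hypothetical.

```python
def interpolate_population(pop_start, pop_end, year_start, year_end, year):
    """Linear interpolation between two census counts (a simplifying assumption)."""
    if not year_start <= year <= year_end:
        raise ValueError("year must lie between the two census years")
    fraction = (year - year_start) / (year_end - year_start)
    return pop_start + fraction * (pop_end - pop_start)


# Invented counts for a small area at two consecutive censuses.
estimate = interpolate_population(12_400, 15_100, 1980, 1990, 1987)
print(round(estimate))
```

Linear interpolation ignores uneven growth, migration, and changes in age structure, which is one reason more refined methods have been developed.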

Methods have also been developed for defining hospital service areas in metropolitan
areas (68). Although these methods have most commonly been used in studies of
health-services utilization in different geographic areas, they are potentially of
value in analyses of data generated by hospital-based surveillance at the state or
local level. Small-area analyses in health-services research have recently been
reviewed (69). The statistical issues raised by these studies are also relevant to
analyses of surveillance data (70).

Although more elaborate techniques have been described, most analyses of surveillance
data are quite simple--frequencies, proportions, and rates--which may be conveniently
presented as tables, graphs, or maps. Indeed, the simplest analyses--the
number of births to teenagers by census tract, or crude death rates by county--may be
the  most  useful  for  documenting  the  need  for  services.  Simple  analyses  should  be  done 
and  their  results  thoughtfully  considered  before  more  complicated  procedures  are 
undertaken.  By  far  the  most  common  error  made  in  analysis  of  surveillance  data  is 
failure  to  look  at  the  data. 

DISSEMINATION OF SURVEILLANCE: STATE AND LOCAL PERSPECTIVES

Most  of  the  issues  relevant  to  the  dissemination  of  surveillance  data  at  the  state  and 
local  level  have  been  addressed  in  Chapter  VII.  The  role  of  newsletters,  annual 
reports,  and  press  releases  has  already  been  addressed,  as  has  the  importance  of  clear 
presentation  and  use  of  graphics.   Mapping  is  a  powerful  technique  for  presenting 
data.   Electronic  mail  systems  have  been  developed  in  some  states  to  facilitate  the 
dissemination  of  information  between  state  and  local  health  departments. 

RESOURCES  FOR  SURVEILLANCE  AT  THE  STATE  AND  LOCAL 
LEVEL 

No  model  system  for  surveillance  at  the  state  or  local  level  exists.  There  is  great 
variation  in  organizational  structure  of  state  and  local  health  departments,  and 
surveillance  activities  are  usually  closely  linked  to  disease-control  programs. 
Although  this  linkage  helps  assure  that  the  data  collected  will  indeed  be  used,  it 
complicates  efforts  to  document  the  resources,  personnel  and  other,  needed  for 
surveillance;  surveillance  cannot  be  readily  separated  from  other  related  activities. 

There are only a few published reports that address the cost of routine surveillance
systems for communicable disease in state health departments. The cost of a newly
established active surveillance system that surveyed half the primary-care physicians
in Vermont was estimated to be $20,000 annually, compared with $3,000 for passive
surveillance (50). A study of the sentinel active surveillance system in Los Angeles
County estimated that the additional cost of weekly contacts made with selected
hospitals, physicians, schools, day-care centers, and university health centers was
approximately $7,000 per year, compared with an estimated $10,000 per year for passive
surveillance. The Los Angeles County costs reflected student rather than professional
staff time and did not include time expended in recording reports at the health
department (7). In 1985, the Kentucky Department for Health conducted active
surveillance for hepatitis A infections among one-half of primary-care practitioners
in 45 of 120 counties in the state. The 22-week active surveillance program was
estimated to cost $5,616. Although the system was cost-effective overall, because the
administration of immune globulin to contacts averted an estimated $14,021 in direct
medical and indirect costs of potential subsequent cases, the health department itself,
of course, incurred increased cost. The system was not continued after the study was
completed (71).
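
The Kentucky figures illustrate how a program can be cost-effective overall and still be a net expense to the agency that runs it. The arithmetic below uses only the two dollar amounts quoted above; the variable names are illustrative, and the benefit-cost ratio is a derived quantity, not a figure reported in the cited study.

```python
# Dollar amounts as quoted in the text for the 22-week Kentucky program.
program_cost = 5_616      # cost borne by the health department
costs_averted = 14_021    # direct medical and indirect costs averted

# Overall view: averted costs minus program cost.
net_savings = costs_averted - program_cost
benefit_cost_ratio = costs_averted / program_cost

print(f"net savings overall: ${net_savings:,}")
print(f"benefit-cost ratio: {benefit_cost_ratio:.2f}")

# Health department's own budget: it pays the program cost, while the
# averted costs accrue mostly elsewhere (patients, insurers, employers).
print(f"change in health department spending: +${program_cost:,}")
```

The mismatch between who pays and who benefits is consistent with the observation that the system was not continued after the study ended.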

Higher quality data on cost are available for some more recently developed
surveillance systems at the state level. A survey of 24 state and metropolitan health
departments that conducted surveillance for nutrition in 1981 found that an average of
16.6 hours of work by a nutritionist was required each month for the surveillance
system. Eight and one-half hours of clerical time were needed, along with support
from statisticians, computer technicians, and others (72).

Data collection, coding, and entry for 2,000 persons with injuries seen at a single
hospital participating in the National Electronic Injury Surveillance System cost
approximately $7,000 in 1989 (12).

Costs  of  the  BRFSS  are  shared  by  CDC  and  participating  state  health  departments 
through  cooperative  agreements.   In  1987,  the  cost  per  state  was  approximately 
$50,000,  or  approximately  $25-$30  per  completed  telephone  interview  (24). 

Part  of  the  Statewide  Childhood  Injury  Prevention  Project  (SCIPP)  in  Massachusetts 
involved  conducting  a  random-digit  telephone  survey.   Information  on  injuries  in  the 
previous  2  months  was  obtained;  because  of  the  relative  infrequency  of  these  events,  a 
large  sample  size  was  needed.  Twelve  hundred  households  were  contacted  at  a  cost  of 
$25,000,  yielding  reports  of  only  80  injuries,  most  of  which  were  falls  (73). 
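
The cost figures in the preceding three paragraphs are easier to compare when reduced to unit costs. The sketch below simply divides the quoted totals; the variable names are illustrative, and the implied number of BRFSS interviews is derived from the quoted per-interview range rather than reported in the source.

```python
# Totals as quoted in the text; unit costs are simple divisions.
neiss_cost_per_person = 7_000 / 2_000          # one hospital, 2,000 injured persons
scipp_cost_per_household = 25_000 / 1_200      # telephone survey contacts
scipp_cost_per_injury = 25_000 / 80            # injuries actually reported

# BRFSS: about $50,000 per state at $25-$30 per completed interview
# implies a range for the number of completed interviews.
brfss_interviews_low = 50_000 / 30
brfss_interviews_high = 50_000 / 25

print(f"NEISS: ${neiss_cost_per_person:.2f} per injured person")
print(f"SCIPP: ${scipp_cost_per_household:.2f} per household contacted")
print(f"SCIPP: ${scipp_cost_per_injury:.2f} per injury identified")
print(f"BRFSS: about {brfss_interviews_low:.0f} to {brfss_interviews_high:.0f} interviews per state")
```

The contrast between the cost per household and the cost per injury identified shows how expensive population surveys become when the event of interest is infrequent.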


More complete and accurate documentation of the costs of surveillance--including data
analysis and dissemination--may facilitate funding, particularly in the current era of
tight constraints on state budgets. Explicit discussion of costs and benefits may
help, both in terms of protecting (if not increasing) funding levels and assuring that
existing surveillance systems are necessary and make the best possible use of
personnel time.

SUMMARY 

Public health surveillance--the systematic and ongoing collection of data pertinent to
public health, and the subsequent analysis and dissemination of these data--is the
first step toward action in public health, but it is only the first step. A number of
approaches to translation of data into action have been developed, with emphasis on
the local level. The Assessment Protocol for Excellence in Public Health (APEXPH),
developed in collaboration with the National Association of County Health Officers,
guides local health department officials through identification of health problems
that require priority attention and through building of community coalitions for
action (74). Such an approach provides a good foundation for adopting community
health objectives (75). These methods have been very successful in communities that
have undertaken them, and they provide useful outlines for translating information
into action at the community level. For example, in Tucson, Arizona, a community
coalition targeted for action the high rate of infant mortality, with the result that
a new program to provide prenatal care was established.

Other  examples,  at  the  state  level,  are  readily  available.  National  studies  that 
found  that  residents  of  Delaware  died  at  high  rates  of  preventable  chronic  disease 
resulted  in  a  statewide  cancer  control  plan,  including  a  mobile  mammography  unit  for 
inner-city  neighborhoods.  Widespread  measles  outbreaks  occurred  in  New  York  State  in 
1989  among  high  school  and  college  students  who  had  been  previously  vaccinated. 
Surveillance  data  led  New  York  officials  to  reconsider  the  state's  vaccination 
strategy,  with  the  result  that  in  April  1989  New  York  became  the  first  state  in  the 
United  States  to  adopt  a  two-dose  schedule  for  routine  measles  vaccination  (76). 
Similarly, surveillance data in Tennessee led to the adoption of a statewide
vaccination requirement for children who attend school in the state (Figure XII.1).

The competition for limited dollars and for the attention of policy makers and the
public is intense. The challenge is to identify problems, set priorities, and work
with communities to develop solutions. More than ever, it is important to use data to
decide among competing priorities and allocate limited resources--the most important
of which are the time and energy of the public health practitioner and the best
interests of the public.

Chapter XIII

Important   Surveillance   Issues   in 
Developing  Countries 

Mac  Otten 

"The  health  of  the  people  is  really  the  foundation  upon  which  all  their  happiness  and  all 
their powers as a state depend."

Benjamin  Disraeli 

INTRODUCTION 

Previous  chapters  in  this  book  have  discussed  surveillance  largely  from  the 
perspective  of  developed  countries.  Although  the  issues  they  address  are  relevant  to 
all  nations,  developing  countries  have  unique  needs  and  opportunities.   The  health 
conditions typically associated with the developing world--diarrhea, malaria,
pneumonia, and malnutrition--occur in settings with only rudimentary health care.
This  chapter  highlights  a  number  of  surveillance  issues  relevant  to  developing 
countries,  including  resource  constraints. 

Although  conducting  surveillance  in  developing  countries  is  complex,  it  also  presents 
unique  opportunities.  Because  the  formal  health-care  system  is  often  an  integral  part 
of  organized  government  services,  there  are  fewer  impediments  to  implementing 
surveillance  systems.  The  limited  number  of  health-care  providers  and  diagnostic 
laboratories  reduces  the  number  of  data  sources,  which  can  facilitate  quality 
assurance.  Moreover,  acute  diseases  and  injuries  still  represent  major  causes  of 
morbidity  and  mortality  in  many  of  these  countries;  these  are  conditions  for  which 
surveillance techniques are well developed. Finally, communities often have
well-defined health systems that can be used for surveillance purposes. These
opportunities should be taken when feasible--despite such obstacles as rudimentary
record-keeping systems and limited resources, diagnostic laboratories,
demographic and vital information, and infrastructure.

Four issues relating to surveillance are covered in this chapter: a) planning, b) data
sources (e.g., vital statistics, surveys, and sentinel surveillance), c) surveillance
at the local level, and d) development of integrated surveillance systems. In this
chapter, the term "local" refers to the health station (which we assume to be the
lowest level of the formal health system), where health assistants work. In addition,
"population-based" is used to describe information for all persons in a certain
geographic unit as opposed to facility-based information, which may represent only
persons from the catchment area of a given health facility.

PLANNING 

Identifying  Health  Objectives  and  Linkage  to  Surveillance 

Identifying measurable health objectives, assigning them priority, and then
linking surveillance to those objectives is a high-priority activity both for the
surveillance system and for health-system development in general (1-3). Linking
surveillance to these ordered health objectives avoids the pitfall of thinking
of surveillance as just the reporting of disease rather than as a system that uses
information from multiple sources (such as sentinel sites, exit interviews, and
regular surveys). Linking surveillance to objectives will help planners of the
surveillance system to think creatively in efforts to build a surveillance system
to measure all priority health objectives. Table XIII.1 lists data sources that
could be used in building a surveillance system in a developing country.

Throughout  the  world,  health  objectives  should  be  based  on  health  impact, 
feasibility  of  intervention,  and  cost-effectiveness  of  the  intervention.   In 
developing  countries,  measurable  health  objectives  often  cannot  be  identified 
because  high-quality,  population-based  mortality  data  are  often  missing.   As  a 
result,  estimates  of  mortality  and  health  outcome  from  such  international 
organizations  as  United  Nations  International  Children's  Emergency  Fund  (UNICEF) 
and  the  World  Health  Organization  (WHO),  international  conferences,  and  population 
laboratories  (e.g.,  International  Center  for  Diarrheal  Disease  Research, 
Bangladesh) are used. Although health problems are similar in most developing
countries (Table XIII.2), relying on data from other countries can create major
problems, especially for conditions for which impact is not clearly known (e.g.,
hepatitis B, iodine deficiency, or malaria) or for emerging health problems (e.g.,
human immunodeficiency virus [HIV] infection, tobacco use, and motor-vehicle
injuries).

The  need  for  country-specific  data  is  illustrated  by  the  finding  of  World  Bank 
analysts  that  oral-rehydration  therapy  (ORT)  in  low-mortality  environments  is  much 
less  cost-effective  than  passive  case  detection  and  short-course  chemotherapy  for 
tuberculosis,  whereas  ORT  in  high-mortality  environments  is  very  cost-effective 
(1).   The  cost-effectiveness  varies  by  a  factor  of  2  to  10,  depending  on  the  local 
situation. 

Health  objectives  should  focus  both  on  current  health  status  and  on  anticipated 
health  needs.   It  may  be  more  cost-effective  to  address  preventive  strategies 
(e.g., early bottle feeding, cessation of tobacco use, use of seat belts, and
sanitation) now rather than when the impact of adverse events becomes more
apparent.

For each health objective, the surveillance method for evaluating that objective
and its sub-objectives should be listed (Table XIII.3). Once such a list is made,
a surveillance grid can be constructed to show which component of the surveillance
system will measure which objective (Table XIII.4). Completing a surveillance
grid helps one visualize the overall structure and function of the surveillance
system.

The  process  of  defining  objectives,  linking  objectives  to  surveillance  components, 
and  constructing  surveillance  grids  will  highlight  surveillance  needs.   The 
process  provides  a  basis  for  strengthening  existing  components,  for  identifying 
existing  information  that  could  measure  objectives,  and  for  developing  innovative 
new  surveillance  system  components.   For  example,  in  many  countries,  the  process 
of  linking  surveillance  to  objectives  highlights  the  need  for  mortality  data  and 
the absence of vital statistics.

Often, the most important objectives--the reductions in mortality associated with
diarrhea and measles--are measured in sentinel areas, since in many countries
vital events are not registered for the entire country (Table XIII.4). Risk
factors, health-related behavior, and health interventions--such as ORT and use of
fluids at home, feeding during diarrhea, use of contraception, use of condoms, use
of chloroquine, and missed opportunities for vaccinations--can be measured nationally
with regularly scheduled surveys. Risk factors and interventions can also be
identified through exit interviews at the district, health-center, health-station,
or village level.

Using  a  surveillance  grid  developed  for  a  hypothetical  country,  one  sees  that 
surveillance  for  HIV  is  not  as  straightforward  as  for  measles  and  diarrhea  (Table 
XIII.4).   The  primary  health-status  outcome  chosen  by  this  country's  ministry  of
health  was  not  HIV-related  mortality  or  acquired  immunodeficiency  syndrome  (AIDS), 
but  HIV  seroprevalence  in  selected  areas  and  selected  populations.   Therefore, 
sentinel  vital-event  registration  areas  will  not  be  used  to  measure  the  HIV- 
related  objectives.   In  addition,  the  objectives  for  HIV-related  risk  factors  and 
health  interventions  are  targeted  at  certain  areas  (areas  in  which  HIV 
seroprevalence  among  patients  with  sexually  transmitted  diseases  [STDs]  is  >10%).
Since  national  surveys  provide  estimates  only  for  the  country  as  a  whole,  national 
surveys  will  not  be  the  primary  method  for  measuring  progress  of  objectives 
related  to  HIV  risk  factors,  behavior,  and  health  interventions  at  a  state  or
local  level. 
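
The grid described above can be thought of as a simple mapping from objectives to the surveillance components that measure them. The following is a minimal sketch in Python, not a reproduction of Table XIII.4: the objective names, the component names, and the helper functions are all hypothetical and chosen only to mirror the examples in the text.

```python
# Hypothetical surveillance grid: each objective maps to the surveillance
# components expected to measure it. All entries are illustrative
# assumptions, not the contents of Table XIII.4.
SURVEILLANCE_GRID = {
    "measles mortality": ["sentinel vital-event registration"],
    "diarrhea mortality": ["sentinel vital-event registration"],
    "ORT use during diarrhea": ["national survey", "exit interviews"],
    "missed opportunities for vaccination": ["national survey", "exit interviews"],
    "HIV seroprevalence in selected populations": ["sentinel laboratory testing"],
    "condom use in high-seroprevalence areas": [],  # no component assigned yet
}


def objectives_by_component(grid):
    """Invert the grid: component -> sorted list of objectives it measures."""
    inverted = {}
    for objective, components in grid.items():
        for component in components:
            inverted.setdefault(component, []).append(objective)
    return {component: sorted(objs) for component, objs in inverted.items()}


def unmeasured_objectives(grid):
    """Return objectives with no component assigned (gaps the grid exposes)."""
    return sorted(obj for obj, components in grid.items() if not components)


if __name__ == "__main__":
    for component, objectives in objectives_by_component(SURVEILLANCE_GRID).items():
        print(component, "->", objectives)
    print("Not yet measured:", unmeasured_objectives(SURVEILLANCE_GRID))
```

Listing the objectives that no component measures is one way to make surveillance needs visible, in the spirit of the grid exercise described in the text.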

Examining  the  surveillance  system  as  a  whole  is  important  for  assigning  resources. 
For  diseases  such  as  measles,  diarrhea,  pneumonia,  and  pertussis,  surveillance 
traditionally  includes  measurement  of  mortality  in  vital  registration  and 
measurement  of  risk  factors  and  health  interventions  nationally  with  surveys  and 
locally  with  exit  interviews  (4).   However,  conditions  such  as  HIV,  malaria,
malnutrition,  tuberculosis  (TB),  vitamin  A  deficiency,  and  hepatitis  B  can  be
difficult  to  measure. 

Use  of  a  surveillance  grid  facilitates  the  integration  of  some  aspects  of 
surveillance  and  may  increase  cost-efficiency.   For  example,  a  laboratory  team  may
go  to  12  sentinel  sites  in  a  year  and  test  blood  for  HIV  from  pregnant  women  and 
patients  with  sexually  transmitted  diseases  (STDs),  blood  for  syphilis  serology 
from  20-  to  24-year-old  pregnant  women,  sputum  from  50  patients  with  cough  for  at 
least  1  month,  and  blood  smears  from  50  children  with  fever.   Efficiency  can  be 
gained  by  constructing  surveys--cluster  surveys  or  exit  interviews--that  integrate 
questions  about  priority  topics  such  as  diarrhea,  measles,  HIV,  tobacco  use,  and 
birth  spacing. 

Surveillance  of  Measures  of  "Outcome"  Versus  "Process" 

Currently,  at  national  and  global  levels,  much  emphasis  is  being  placed  on
measurement  of  processes  (e.g.,  coverage  with  vaccinations)  rather  than  on  the
measurement  of  health  outcomes  (e.g.,  cases  of  measles)  as  the  primary  focus  (5).   Emphasis  is
placed  on  process  measures,  in  part,  because  systems  for  efficient  measurement  of 
population-based  health  outcomes  do  not  exist. 

There  are  two  major  problems  with  process  measures.    First,  process  measures  do 
not  directly  measure  primary  events  of  interest--death  and  disease--or  the
effectiveness  of  the  processes  (interventions).   In  contrast,  the  health  outcome
is  the  measure  of  interest,  and  what  is  measured  is  the  effectiveness  (i.e.,  the
combined  effect  of  the  coverage  and  the  efficacy  of  the  intervention).

The  usefulness  of  a  process  measure  for  surveillance  depends  on  the  true  and 
consistent  effectiveness  of  the  intervention  being  measured.   Focusing  on  the 
measurement  of  processes  is  most  suitable  when  the  intervention  is  documented  to 
have  consistent,  high  effectiveness.   For  example,  measles  vaccine  administered  to 
a  9-month-old  infant  is  thought  to  be  90%  effective  in  preventing  subsequent 
measles  (6).   Therefore,  if  a  child  receives  measles  vaccine  before  being  exposed
to  measles  virus,  the  probability  that  s/he  will  have  clinical  measles  is  very 
low. 

The  difficulty  with  process  measurements,  however,  exists  even  with  an
intervention  as  highly  effective  as  measles  vaccine  (e.g.,  children  infected  with
measles  virus  before  vaccination  are  not  protected  by  vaccine).   The  effectiveness
of  most  interventions  is  often  less  than  that  of  measles  vaccine,  and  the
effectiveness  of  the  delivery  of  such  interventions  varies  substantially  from
setting  to  setting.   For  example,  on  the  basis  of  the  industrialized-country
experience,  three  doses  of  OPV  were  thought  to  have  an  effectiveness  of  at  least 
95%  in  all  settings  (7,8).   Yet,  recent  evaluations  of  field  vaccine  efficacy, 
reviews  of  serologic  efficacy,  and  outbreaks  in  countries  with  high  coverage  with 
OPV  have  shown  that  the  effectiveness  of  OPV  in  developing  countries  is  not  as 
high  as  in  industrialized  countries,  and  that  process  measures  of  OPV  coverage  can 
lead  to  a  false  sense  of  security  (9-12).

In  programs  in  which  an  intervention  has  high  and  consistent  effectiveness,  the 
magnitude  of  the  problem  of  using  process  measures  also  depends  on  the  stage  of 
development  of  a  program.   If  an  intervention  is  reliably  70%-90%  effective,  as 
are  measles  vaccine  and  OPV,  one  can  be  relatively  confident  that  health  outcomes 
will  be  positively  affected  if  coverage  increases  from  20%  to  80%.   However,  one 
cannot  be  at  all  confident  of  any  change  in  health  outcome  if  coverage  increases 
from  80%  to  90%  or  95%.   In  fact,  statistically  significant  changes  in  coverage 
from  80%  to  90%  or  90%  to  95%  cannot  be  detected  by  current  methods  of 
measurement.
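
The arithmetic behind this argument can be sketched briefly. The example below assumes, as a simplification, that the fraction of the target population protected is the product of coverage and efficacy; it ignores timing of exposure, herd effects, and variation in delivery. The function name is hypothetical. Under that assumption, with a 90% efficacious intervention, raising coverage from 20% to 80% raises the protected fraction from 18% to 72%, whereas raising coverage from 80% to 90% adds only 9 percentage points.

```python
def fraction_protected(coverage, efficacy):
    """Expected fraction of the target population protected.

    Simplifying assumption (not from the source): protection is the
    product of coverage and efficacy, both expressed as fractions 0-1.
    """
    if not (0.0 <= coverage <= 1.0 and 0.0 <= efficacy <= 1.0):
        raise ValueError("coverage and efficacy must be between 0 and 1")
    return coverage * efficacy


if __name__ == "__main__":
    # Efficacy values span the 70%-90% range mentioned in the text.
    for efficacy in (0.70, 0.90):
        for coverage in (0.20, 0.80, 0.90, 0.95):
            protected = fraction_protected(coverage, efficacy)
            print(f"efficacy {efficacy:.0%}, coverage {coverage:.0%}: "
                  f"{protected:.1%} protected")
```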

A  second  major  problem  with  process  measures  is  measurement  accuracy. 
Intervention  activities  are  often  measured  by  administrative  methods  and 
population-based  surveys.   An  example  of  the  administrative  method  of  estimating 
the  percentage  coverage  of  an  intervention  is  counting  the  number  of  vaccinations 
administered  and  then  dividing  by  some  denominator,  such  as  the  population  in  the 
catchment  area  <1  year  of  age. 

The  administrative  method  is  relatively  easy  and  cheap  to  perform  and  is  available 
locally.   On  the  other  hand,  both  the  numerator  and  the  denominator  are  often 
unavailable.   For  example,  to  estimate  the  percentage  of  persons  who  have  received 
a  complete  series  of  OPV,  one  must  know  the  number  of  third  doses  of  OPV
administered;  this  number  is  often  not  recorded. 
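
The administrative method amounts to a single division, which the sketch below makes explicit. The function name and the figures in the usage example are hypothetical. Note that the result can exceed 100% when the denominator (for example, the estimated population under 1 year of age in the catchment area) is too small, which is one way inaccurate inputs show up in practice.

```python
def administrative_coverage(doses_administered, target_population):
    """Estimate coverage as doses administered divided by the target population.

    Returns a fraction; values above 1.0 suggest a problem with the
    numerator or, more often, an underestimated denominator.
    """
    if target_population <= 0:
        raise ValueError("target population must be positive")
    if doses_administered < 0:
        raise ValueError("doses administered cannot be negative")
    return doses_administered / target_population


if __name__ == "__main__":
    # Hypothetical figures: 450 third doses of OPV recorded, 600 children
    # under 1 year of age estimated in the catchment area.
    coverage = administrative_coverage(450, 600)
    print(f"Estimated OPV3 coverage: {coverage:.0%}")
```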

To  overcome  the  limitations  of  administrative  data,  population-based  surveys  are 
used  to  provide  process  measures  (e.g.,  the  percentage  of  persons  who  received  ORT 
during  the  most  recent  episode  of  diarrhea  and  the  percentage  of  reproductive-age
women  who  use  modern  methods  of  family  planning),  especially  at  the  national 
level.   Yet,  there  are  increased  costs  associated  with  surveys  and  numerous 
potential  inaccuracies  from  current  survey  tools  (see  section  on  surveys  below).

Using  Outcome  To  Measure  Process 

In  any  international  setting,  surveillance  for  both  outcomes  and  processes  is 
desirable,  but  the  focus  of  surveillance  should  be  on  outcome  measures.   Outcome- 
based  programs  have  been  extremely  successful  for  global  progress  to  eradicate 
smallpox,  guinea  worm,  and  poliomyelitis.   The  smallpox  program,  which  started  out 
as  a  process-based  (coverage-driven)  program,  switched  to  an  outcome-based 
program,  which  led  to  improved  program  effectiveness  (13).   An  outcome-based
program  in  the  Americas  has  decreased  the  number  of  cases  of  poliomyelitis  from
nearly  3,000  in  1980  to  a  handful  by  1990  (14).   See  Appendix  XIII.A  for  a  more
detailed  discussion. 


POPULATION-BASED  SURVEILLANCE 


Population-based  surveillance  is  especially  important  in  many  developing  countries 
because  of  the  disparities  of  access  to  health  facilities  and  health  status  in 
urban  centers  versus  rural  areas.   A  single  hospital  in  the  capital  city  often 
consumes  25%-50%  of  the  health  budget  for  an  entire  country.   Since  surveillance 
from  sentinel  sites  and  health  facilities  is  often  concentrated  in  urban  areas, 
public  health  needs  in  rural  areas  may  not  be  well  represented  by  policy  makers  at
the  national  level  unless  population-based  surveillance  systems  are  used.

Vital-Event  Registration

The  measurement  of  vital  events  is  the  most  important  single  addition  that 
developing  countries  can  make  to  their  existing  surveillance  system  (see  Chapter
III).   Death  and  birth  rates--along  with  cause-specific,  age-specific,  and
gender-specific  rates--are  very  useful.   In  the  United  States,  for  example,  13  of  the  18
status  indicators  chosen  to  measure  the  health  status  of  the  population  as  part  of 
the  health  objectives  for  the  nation  will  be  measured  using  vital  records  (15).

Why  so  little  emphasis  has  been  placed  by  developing  countries  on  establishing 
vital-event  registration  is  not  clear.   Registration  could  begin  in  small  sentinel 
areas,  could  be  evaluated  for  problems,  and  then  could  be  expanded.   The
vital-registration  system  in  the  United  States  started  in  1900  in  10  sentinel  states,
and  it  took  23  years  for  all  states  to  be  admitted  into  the  system  (16).
Obviously,  in  the  early  stages  of  setting  up  a  registry,  some  births  and  deaths 
would  be  missed.   As  late  as  1974-1977,  21%  of  neonatal  deaths  were  not  registered 
in  Georgia  (17);  despite  this  underregistration,  vital  data  have  been  extremely 
useful. 

In  areas  in  which  routine  mortality  data  are  not  available,  the  verbal  autopsy,  in
which  trained  or  untrained  workers  take  histories  from  family  members  to  classify
deaths  by  cause,  is  a  useful  technique  (18).   In  1978,  WHO  published  a  monograph
called  Lay  Reporting  of  Health  Information  (19).   It  contained  a  detailed  list  of
approximately  150  causes  of  death  and  a  minimal  list  of  30  causes  that  could  be 
used  by  non-physicians  to  classify  deaths  by  cause. 

In  establishing  vital-event  systems,  consideration  should  be  given  to  including 
the  registration  of  pregnancy.   This  is  especially  needed  to  measure  the  number  of 
neonatal  deaths,  which  in  turn  is  needed  to  allow  accurate  infant-mortality  rates
to  be  calculated.   Registration  of  pregnancies  would  allow  measurement  of  prenatal
care,  fetal  death  associated  with  syphilis,  family  planning,  and  other  important 
health  concerns. 

Regular,  Periodic  Surveys 

Regular,  periodic  surveys  can  be  an  important  component  of  a  surveillance  system. 
In  particular,  cluster  surveys--multi-stage  surveys  with  primary  sampling  units--
are  important  surveillance  tools  in  many  developing  countries  because  they  are  the
only  feasible  method  of  collecting  population-based  information  (20).
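
Because cluster surveys sample groups rather than individuals, their estimates are usually less precise than a simple random sample of the same size. The sketch below illustrates the effect on a coverage estimate using the normal approximation for a proportion, inflated by a design effect. The sample size (30 clusters of 7 children, 210 in total) and the design effect of 2 are assumptions made for illustration and are not taken from this chapter; the function name is hypothetical.

```python
import math


def ci_half_width(p, n, design_effect=1.0, z=1.96):
    """Approximate half-width of a 95% confidence interval for a proportion.

    Uses the normal approximation, with the variance multiplied by an
    assumed design effect to account for cluster sampling.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n <= 0 or design_effect <= 0:
        raise ValueError("n and design_effect must be positive")
    return z * math.sqrt(design_effect * p * (1.0 - p) / n)


if __name__ == "__main__":
    n = 30 * 7  # assumed layout: 30 clusters, 7 children per cluster
    for coverage in (0.80, 0.90, 0.95):
        half_width = ci_half_width(coverage, n, design_effect=2.0)
        print(f"coverage {coverage:.0%}: +/- {half_width * 100:.1f} percentage points")
```

Under these assumptions the interval around an estimate of 80% is several percentage points wide on each side, which suggests why small changes in coverage are hard to distinguish from sampling variation in a single survey of this size.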

Cluster  surveys  have  not  been  thought  of  as  an  essential  and  regularly  performed 
surveillance  activity.   Surveys  have  generally  been  single-purpose  and  have  been 
conducted  intermittently  on  an  as-needed  basis,  often  at  the  request  of 
international  organizations.   However,  because  the  survey  is  the  only  method  of 
gathering  population-based  information  in  many  countries  and  surveys  can  be  used 
to  collect  information  on  a  variety  of  health  topics,  regularly  scheduled  surveys 
can  constitute  an  excellent  surveillance  tool  (see  Behavioral  Risk  Factor 
Surveillance  in  Chapter  III). 

To  assure  the  development  of  a  useful  national  surveillance  system  in  a  developing 
country,  a  survey  unit  or  survey  person  should  be  assigned  the  task  of 
coordinating  all  national  health  surveys.   The  coordinator  first  works  with 
program  staff  to  develop  surveillance  questions  in  high-priority  areas  (e.g., 
diarrhea,  vaccinations,  HIV/AIDS,  family  planning,  child  survival,  malaria,  and 
tuberculosis).   Two  to  five  questions  are  often  adequate  for  some  conditions.   The 
questions  should  be  assigned  priority  so  that  the  survey  coordinator  has  some 
flexibility  to  shorten  the  overall  questionnaire  if  needed. 


Previously  conducted  surveys  can  serve  as  models  for  adaptation  to  local 
situations.   For  example,  for  vaccination-related  questions,  the  Expanded 
Programme  on  Immunization  (EPI)  at  WHO  has  a  useful  module.  WHO  also  has  useful 
questionnaires  for  diarrhea;  acute  respiratory-tract  infections;  and  knowledge,
attitude,  and  behavior  associated  with  HIV  infection.   The  Centers  for  Disease 
Control  (CDC)  has  questionnaires  on  child  mortality,  health-station  practices, 
nutrition,  HIV  risk  behavior  among  youths,  and  others. 

Once  questionnaire  modules  have  been  developed,  each  module  should  be  field  tested 
for  readiness  for  implementation.   Advance  preparation  and  testing  are  very 
important;  it  is  both  difficult  and  time-consuming  to  develop  an  effective 
questionnaire. 

A  small  set  (10  or  so)  of  core  questions  measuring  the  highest-priority  objectives 
should  be  included  in  every  survey.   Some  space  should  be  reserved  for  last-minute 
questions  on  information  desired  by  high-level  policy  makers.   Not  only  will  this 
demonstrate  the  timeliness  of  this  surveillance  component,  but  it  might  facilitate 
political  and  financial  support  for  its  continuation.   Finally,  when  the  time 
comes  for  a  survey,  the  survey  coordinator  puts  together  the  core  questions,  the 
last-minute  questions  from  the  policy  makers,  and  the  appropriate  survey  modules. 


Data  collection  desired  by  international  organizations  can  be  integrated  into  the 
ministry  of  health's  schedule  of  surveys.   The  survey  coordinator  can  provide  the 
international  organization  that  wishes  to  have  a  survey  conducted  with  the 
schedule  and  proposed  modules  to  be  used.   The  two  groups  can  then  collaborate  to 
determine  how  the  needs  of  both  groups  could  be  met.   The  international  group  can 
help  train  survey-unit  staff  and  can  help  maintain  a  training  manual  on  designing 
and  conducting  a  survey,  including  interviewing  techniques.  This  method  is  a 
cost-effective  way  to  build  local  capacity  and  facilitate  sustainability.   See
Appendix  XIII.B  for  a  discussion  of  some  statistical  issues  in  cluster  surveys.

Sentinel  Surveillance

Sentinel  surveillance  at  health  facilities  can  play  a  critical  role  in 
surveillance  in  developing  countries.   Sentinel  sites  are  used  to  a)  collect
important  information  not  collected  at  all  sites  and  b)  pilot  collection  of  new 
information  in  order  to  be  able  to  assess  the  usefulness  of  the  data  and  the 
method  of  collection.   Since  routinely  reported  information  from  all  sites  must  be 
restricted  to  high-priority  items  and  must  be  easy  to  collect,  much  important 
information  is  unlikely  to  be  collected  from  all  health  facilities. 

At  sentinel  sites,  more  resources  and  more  experienced  and  dedicated  personnel  can 
often  be  used  to  collect  information  on  more  diseases,  more  detailed  information 
about  each  case,  and  more  difficult-to-collect  information  such  as  sexual
behavior.  Also,  sentinel  sites  can  often  serve  as  sources  of  information  about 
new  conditions  and  can  be  used  to  determine  the  most  effective  methods  for 
inserting  newly  required  data  into  the  routine  collection  system. 

There  are  several  potential  problems  in  interpreting  data  from  sentinel  sites. 
Sentinel  sites  are  often  hospitals  or  other  sophisticated  facilities  and  tend  to 
serve  urban  patients.   Such  data  will  not  reflect  the  small,  rural  health
stations  that  serve  areas  where  the  majority  of  the  population  may  live.   Consequently,  rural  and
small  health  stations  should  be  included  in  the  sentinel-site  system.

Nevertheless,  for  several  reasons,  hospitals  as  sentinel  sites  and  hospitals  in 
urban  areas  can  yield  important  information  in  a  timely  manner  at  a  relatively  low 
cost:   first,  cause-of-death  data  are  available,  permitting  timely  data  collection
and  analysis;   second,  because  the  number  of  visits  and  deaths  is  large,  they
yield  more  precise  estimates  and  allow  subgroup  analysis  by  age,  gender,  or
other  important  variables.  Also,  data  are  currently  available,  whereas  systems  of 
vital  events  and  regular,  periodic  surveys  are  not  generally  established.   For 
example,  in  Kinshasa,  Zaire,  the  Ministry  of  Health  used  a  hospital-based  sentinel
surveillance  system  to  establish  that  measles  remained  an  important  cause  of  death 
for  children  <9  months  old.   The  spread  of  clinically  important  resistance  to 
chloroquine  was  detected  because  of  increasing  mortality  from  malaria  in  sentinel 
hospitals  in  numerous  African  countries  (21).

Surveillance  at  the  Local  Level

Integrated,  well-thought-out  surveillance  at  the  health-station  and  health-center
level  warrants  more  focused  attention,  especially  data  collection,  analysis,  and
dissemination  of  results  as  a  basis  for  public  health  action.   Surveillance
responsibilities  should  be  specified  in  employee  work  plans,  and  completion  of
surveillance  duties  should  be  used  to  assess  health-worker  performance.

WHO  has  surveillance  and  evaluation  training  modules  for  vertical  programs  such  as 
EPI  and  Control  of  Diarrheal  Diseases  (CDD)  (20,22,23),  but  there  are  no  general
surveillance  training  modules  for  district  or  health-station  levels.   Local 
surveillance  is  critical  because  major  health  problems  in  developing  countries 
require  innovative  public  health  action  at  the  local  level.   Local  surveillance 
and  public  health  action  based  on  surveillance  may  be  less  urgent  for  programs 
with  high  effectiveness  and  ease  of  administration  (e.g.,  vaccinations)  or  for
programs  that  depend  solely  on  the  formal  health-care  system  (e.g.,  acute
respiratory  infections  or  tuberculosis).   However,  local  surveillance  and  linked
public  health  action  will  be  essential  for  most  of  the  priority  diseases  (e.g.,
diarrhea,  malaria,  and  HIV)  and  related  prevention  activities  (oral-rehydration
solutions,  chloroquine  for  all  cases  of  fever,  and  condoms).   In  general,  these
interventions  require  extensive  behavior  change  on  the  part  of  clients  and  also 
require  local  problem-solving,  surveillance  of  objectives,  strategy  reformulation, 
and  creative  intervention  by  health  workers  to  be  successful. 


Collection,  Display,  and  Analysis  of  Local  Surveillance  Data 

Analysis  of  surveillance  data  and  action  based  on  that  surveillance  information  at 
the  local  level  have  several  benefits.   If  collected  data  are  prominently 
displayed  as  tables  and  graphs  in  the  local  health  office,  public  health  personnel 
(and  patients)  can  see  the  results  of  data-collection  efforts.   Through  the 
analysis  and  interpretation  of  the  displayed  surveillance  data,  local  staff  can  be 
involved  in  the  process  of  devising  strategies  to  solve  health  problems  and,  at  the
same  time,  can  help  attain  national  and  local  health  objectives.  Such  involvement
gives  health  staff  a  sense  of  participation  and  professionalism. 

The  process  of  designing  a  surveillance  system  for  a  district  or  a  health-station 
is  the  same  as  for  the  national  level.   First,  health  priorities  are  determined  on 
the  basis  of  the  impact  of  the  health  problem  and  the  feasibility  and  cost- 
effectiveness  of  intervention.   Second,  objectives  are  determined  and  assigned 
priority.   Third,  surveillance  components  to  measure  high-priority  objectives  are 
identified  and  implemented.

Four  differences  between  national  and  local  surveillance  sometimes  emerge.   First, 
many  health  stations  will  not  have  mortality  surveillance  based  on  vital-event 
registration,  whereas  national  surveillance  systems  may  include  at  least  a 
sentinel-registration  component.   However,  health  stations  can  begin  sentinel 
population-based  mortality  surveillance  by  starting  vital-event  registration  in 
one  or  two  villages. 

Second,  30-cluster  surveys  conducted  regularly  every  1-3  years  are  not  feasible 
for  district  and  health-station  surveillance  of  risk  factors  and  health 
interventions.

Third,  resource  constraints  at  the  local  level  limit  the  number  of  sentinel  sites. 
However,  both  health  stations  and  districts  can  conduct  a  form  of  sentinel 
surveillance  by  limiting  data  collection  on  some  health  problems  to  a  small  sample 
of  sites  at  infrequent  intervals.   For  example,  although  children  have  their 
growth  monitored  throughout  the  year,  the  percentage  with  weight-for-age  of  <80%
of  standard  might  be  calculated  only  once  every  3  months  on  a  consecutive  sample 
of  30  children. 
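
The quarterly calculation described above can be sketched as follows. The function name, the data layout (pairs of measured weight and standard weight for the child's age), and the sample values are hypothetical; the 80% cutoff is the one given in the text.

```python
def percent_below_cutoff(children, cutoff=0.80):
    """Percentage of children whose weight-for-age is below the cutoff.

    `children` is a list of (weight_kg, standard_weight_kg) pairs, where
    the standard weight is the reference weight for the child's age.
    """
    if not children:
        raise ValueError("at least one child is required")
    below = sum(1 for weight, standard in children if weight / standard < cutoff)
    return 100.0 * below / len(children)


if __name__ == "__main__":
    # Hypothetical consecutive sample (shortened here; the text suggests 30).
    sample = [(8.5, 10.0), (7.5, 10.0), (9.0, 10.0), (7.0, 10.0), (11.0, 12.0)]
    print(f"{percent_below_cutoff(sample):.0f}% below 80% of standard")
```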

Fourth,  limited  resources  require  integration  of  surveillance  and  non-surveillance 
health  information  by  local  health  workers. 


Data  collected  routinely  by  health  stations  should  be  limited  to  high-priority
conditions.   For  example,  mandatory  reporting  could  be  limited  to  10  selected 
diseases  on  the  basis  of  established  priorities  or  reporting  laws.   In  addition, 
the  health  station  should  meet  certain  standards  before  reporting  requirements  are 
expanded:  the  health  station  staff  should  be  a)  reporting  regularly,  b)  displaying 
information  collected,  c)  thinking  about  the  meaning  of  the  data,  d)  using  the 
data  to  solve  health  problems,  and  e)  using  the  data  to  evaluate  programs  targeted 
at  certain  health  problems.   If  these  are  all  being  done,  the  staff  is  likely  to 
become  enthusiastic  about  the  public  health  aspect  of  the  station's  job  and 
initiate  the  idea  of  collecting  more  information.   For  example,  information  for 
each  case-patient  (e.g.,  age  and  date  of  onset  of  disease)  can  be  collected  for 
selected  health  problems  instead  of  just  reporting  the  number  of  cases  of  disease 
(i.e.,  summary-count  data).   Additional  diseases  can  be  added  on  the  basis  of 
priority  setting  (e.g.,  AIDS  or  moderate  and  severe  malnutrition).   The  practice 
of  collecting  data  intermittently  for  special  purposes  can  be  expanded,  and  data 
items  found  to  be  useful  at  sentinel  sites  can  be  added  to  reportable  conditions 
from  all  health  stations  or  at  least  can  be  expanded  to  a  larger  number  of 
sentinel  sites. 

Display  and  interpretation  of  surveillance  data  and  planned  action  based  on  the 
interpretation  can  be  integrated  into  assigned  duties  of  health  workers  and  into 
the  duties  of  their  supervisors.    Each  health  worker  should  have  a  detailed  task 
analysis  or  job  description,  with  the  task  analysis  linked  to  national  and  local 
health  objectives. 

Employee  and  project  work  plans,  based  on  supervisory  visits  and  on  input  from 
members  of  the  community,  should  also  reflect  health  objectives  and  ongoing 
analysis  and  interpretation  of  surveillance  data.   For  example,  if  one  of  the 
high-priority  health  objectives  is  the  reduction  of  measles  cases  by  50%  as  of 
1995  (compared  with  the  1989-1991  baseline)  and  the  graphs  of  measles  cases  by 
year  and  measles  cases  by  month  in  1993  show  no  decline,  the  work  plan  for  the 
next  6  months  might  include  conducting  exit  interviews,  collecting  additional 
information  on  cases,  and  convening  focus  groups. 

Through  focus  groups,  health  workers  can  determine  from  groups  of  mothers  why
children  are  not  being  vaccinated  and  what  might  be  done  to  solve  this  problem.
Exit  interviews  can  be  used  to  determine  measles  vaccination  coverage.   Additional  information
about  the  ages  of  persons  with  measles  can  be  recorded  for  the  next  6  months,  and 
then  the  health  worker  and  supervisor  can  determine  whether  measles  is  a  disease 
primarily  among  infants  or  among  older  persons  as  well.   Using  the  vaccination 
status  of  persons  with  measles,  health  workers  can  estimate  measles  vaccination  coverage.   The
effectiveness  of  a  work  plan  should  then  be  evaluated  both  through  continued 
surveillance  of  measles  cases  and  through  exit  interviews. 
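
The text does not say how coverage would be estimated from the vaccination status of persons with measles. One approach that fits the description is the relationship among the proportion of cases vaccinated (PCV), the proportion of the population vaccinated (PPV), and vaccine efficacy (VE): PCV = PPV(1 - VE) / (1 - PPV x VE), which can be solved for PPV. The sketch below assumes that relationship, assumes the 90% efficacy cited earlier for measles vaccine, and uses a hypothetical function name; it is a rough estimate and is only as good as the assumed efficacy and the recording of vaccination status.

```python
def coverage_from_cases(proportion_cases_vaccinated, vaccine_efficacy=0.90):
    """Estimate population coverage (PPV) from the proportion of cases vaccinated.

    Assumes PCV = PPV * (1 - VE) / (1 - PPV * VE), rearranged to
    PPV = PCV / (1 - VE + PCV * VE). Inputs are fractions between 0 and 1.
    """
    pcv = proportion_cases_vaccinated
    ve = vaccine_efficacy
    if not 0.0 <= pcv <= 1.0:
        raise ValueError("proportion of cases vaccinated must be between 0 and 1")
    if not 0.0 <= ve < 1.0:
        raise ValueError("vaccine efficacy must be at least 0 and below 1")
    return pcv / (1.0 - ve + pcv * ve)


if __name__ == "__main__":
    # Hypothetical: 30% of recorded measles cases had been vaccinated.
    estimate = coverage_from_cases(0.30, vaccine_efficacy=0.90)
    print(f"Estimated vaccination coverage: {estimate:.0%}")
```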

In  addition,  the  6-month  work  plan  could  include  teaching  mothers  about 
appropriate  preparation  and  use  of  oral-rehydration  fluids  at  home.   During  a 
supervisory  visit,  the  supervisor  can  do  exit  interviews  of  30  consecutive  women 
seen  at  the  health  station  and  record  whether  and  what  they  have  been  taught  about 
using  fluids  at  home,  possibly  asking  for  demonstration  of  what  they  have  been 
taught.   At  the  same  exit  interviews,  receipt  of  measles  vaccine  can  be  recorded 
as  a  measure  of  coverage.   This  will  integrate  surveillance  for  measles  coverage 
with  direct  health-worker-performance  assessment  of  a  diarrhea-related  task. 

Exit  Interviews  and  Focus  Groups 

Interviews  of  patients  who  have  finished  their  visits  at  health  facilities,  which 
can  be  called  "exit  interviews,"  can  be  a  flexible,  easy,  and  cost-effective 
method  of  collecting  information.   Exit  interviews  are  ideal  for  measuring 
progress  toward  local  health  objectives.   They  can  be  used  to  collect  data  for 
emergent  problems  or  for  routine  surveillance,  as  well  as  to  evaluate  the 
performance  of  health  workers.   For  surveillance  purposes,  exit  interviews  can  be 
used  to  collect  information  about  "process"  health  objectives,  health  risks, 
health  behavior,  and  health  interventions.   Unlike  surveys,  exit  interviews  can 
be  conducted  frequently.   Supervisory  visits  provide  an  excellent  opportunity  to 
involve  the  supervisor  in  the  conduct  of  exit  interviews. 

Focus  groups  can  make  important  contributions  to  the  design  of  a  surveillance
system.   As  complex  issues  such  as  changes  in  behavior  are  assigned  higher  health
priorities  (e.g.,  HIV-related  behavior,  diet,  home  fluids,  treatment  practices,
and  reasons  for  not  being  vaccinated),  focus  groups  are  often  used  to  gain  new
information. 

Focus  groups  often  provide  an  appropriate  first  step  in  generating  ideas  about  why 
events  and  behavior  occur.   After  ideas  or  hypotheses  are  available,  surveys,  exit 
interviews,  and  special  studies  (case-control  studies)  can  be  used  to  identify 
specific  factors  that  should  be  incorporated  into  surveillance  systems.   Health- 
station  staff  can  use  focus  groups,  along  with  exit  interviews,  to  measure  health 
objectives  of  local  importance. 


BUILDING  INTEGRATED  SURVEILLANCE  SYSTEMS 


Over  the  last  15  years,  the  sophistication  of  public  health  in  developing 
countries  has  increased  greatly.    EPI  provided  one  model  for  surveillance. 
However,  surveillance  for  measles  was  relatively  easy--the  intervention  was 
consistently  and  highly  effective,  and  almost  all  infections  caused  a  distinct, 
noticeable  condition.   The  EPI  surveillance  model  was  not  as  successful
for  problems  such  as  diarrhea,  pneumonia,  family  planning,  and  malaria,  where  the 
interventions  were  less  effective  or  less  consistently  effective  and  where  the 
outcome  of  interest  was  more  difficult  to  measure. 

Then,  HIV  appeared.   Reporting  of  cases  of  AIDS  was  inadequate  for  immediate 
prevention  because  of  the  lengthy  incubation  period  for  this  condition.   Accurate 
surveillance  for  HIV  had  to  rely  on  expensive  laboratory  testing. 

Of  the  top  10  priority  diseases  in  developing  countries,  only  tuberculosis  and 
malaria  require  any  laboratory  testing  (at  least  sentinel  testing)  for 
surveillance,  and  the  diagnostic  tests  for  malaria  and  tuberculosis  (though  not 
the  tests  for  antimicrobial  resistance)  are  relatively  simple  and  inexpensive.   In
addition,  the  appearance  of  HIV  put  new  emphasis  on  the  need  for  surveillance  of 
types  of  health  behavior,  the  main  prevention  focus  for  HIV.   Previously, 
surveillance  had  been  considered  to  be  adequate  in  developing  countries  if  it 
covered  disease  reporting  and  vaccination  coverage. 

Now,  surveillance  data  are  expected  to  be  available  on  risk  factors  and  health 
behavior  (e.g.,  age  at  marriage  and  age  at  first  sexual  intercourse  for
family-planning  purposes),  as  well  as  on  such  newly  important  conditions  as  hepatitis  B,
genital  ulcer  disease,  urethritis,  use  of  tobacco,  and  injuries  associated  with 
motor  vehicles. 

As  public  health  programs  become  more  sophisticated  and  public  health  workers  need 
access  to  more  information  on  more  and  more  conditions,  the  complexity  of  the 
structure  of  surveillance  systems  will  increase.  The  integration  of  surveillance 
and  evaluation  for  vertical  programs  such  as  EPI,  diarrhea,  acute  respiratory 
infections,  HIV/STD,  and  family  planning  into  a  coherent,  rational  surveillance 
system  will  depend  on  the  actions  taken  by  ministries  of  health. 

There  are  several  advantages  to  integration: 

surveillance  information  can  be  gathered  with  greater  cost-efficiency,
requirements  for  health-station  staff  will  be  simplified,  and  their
training  will  be  less  duplicative.

Although  international  organizations,  often  supporting  vertical  programs,  control 
a  substantial  proportion  of  the  resources  being  spent  on  public  health  in 
developing  countries,  these  organizations  are  likely  to  respond  favorably  to  the 
implementation  of  logical,  well-crafted,  integrated  surveillance  systems  that  are 
linked  to  written  national  health  priorities. 

Surveillance  systems  must  continually  focus  on  outcomes  (cases  of  the  health 
problem)  in  order  to  adjust  strategies  and  interventions  for  control  and 
prevention. Many countries are trying to reach low levels of vaccine-preventable
diseases by the year 1995 (measles and neonatal tetanus) or eradication by the
year 2000 (poliomyelitis) (24). The poliomyelitis eradication initiative attempted
to  demonstrate  that  outcome-based  surveillance  intimately  linked  to  intervention 
can  be  the  "leading  wedge"  in  disease  reduction. 

The  sophistication  of  the  tools  available  in  developing  countries  to  analyze 
surveillance  data  has  also  increased.   Surveillance  data  have  been  analyzed  with 
computers  at  the  national  level  for  the  past  several  years.   As  the  prices  of 
computer  hardware  have  continued  to  decrease,  computers  have  been  moved  to  zonal, 
state,  and  provincial  levels.  Epi   Info,    an  inexpensive  and  freely  copyable 
epidemiology  computer  program,  is  now  available  in  English,  French,  Spanish,  and 
Arabic  (25);  also,  manuals  are  available  in  Czech  and  Italian.  Mapping  of 
surveillance  data  has  been  underutilized  because  inexpensive  mapping  programs  that 
can  display  maps  by  district,  health  station,  and  village  and  can  be  linked  to 
surveillance data bases have not been available. However, a mapping program
called Epi Map is compatible with Epi Info and can create maps of surveillance
data automatically.

SUMMARY 


The  vision  for  surveillance  systems  in  developing  countries  as  described  above 
involves  systems  that  are  linked  to  health  objectives,  ordered  by  priority, 
limited  in  scope,  and  not  burdensome  at  the  health-station  level.   These  systems 
should  also  contain  an  extensive  sentinel  network  and  have  strong  elements  of 
population-based  data  gathering  from  surveys  and  vital  event  registration. 
Surveillance  data  need  to  be  collected  routinely.   Sentinel  sites  will  provide  the 
information  required  to  monitor  health  objectives,  but  such  surveillance  should 
also  be  flexible  enough  to  collect  new  data  needed  for  emerging  problems,  and  for 
changing  priorities. 

Health  objectives  provide  national  politicians  and  health  leaders  a  plan  to 
ensure the public's health. With a surveillance system that is linked to these
objectives, leaders will be able to monitor progress made toward meeting national
objectives. With analysis and action at the district and health-station level,
local health staff can take rapid and appropriate action. Population-based vital
statistics can show whether enough emphasis is being placed on health in rural and
remote areas of a country. Health surveys can be conducted as a regular part of
the surveillance system. Expertise and funding provided by international
organizations can help train and maintain a survey coordinator and surveyors.

In implementing surveillance and health systems, developing countries can avoid
the mistakes that industrialized countries have already made: poorly planned and
fragmented surveillance systems, surveillance systems not linked to objectives,
health objectives that are not explicit and often politicized, large divisions
between curative and preventive medicine, and differences in health care in rural
versus urban areas.

As  noted  at  the  beginning  of  this  chapter,  surveillance  in  developing  countries  is 
accompanied  by  numerous  logistic  problems  but  also  presents  unique  opportunities. 
The  careful  setting  of  health  priorities  and  the  meticulous  allocation  of  limited 
resources  to  the  interests  of  the  public's  health  can  be  the  results  of 
surveillance in such settings.

Appendix XIII.A. Using Outcome To Measure Process

This  appendix  describes  a  method  to  estimate  process  measures  from  outcome 
measures.   Some  process  measures  such  as  percentage  coverage  of  an  intervention 
(e.g.,  percentage  using  chloroquine,  percentage  having  received  vaccine, 
percentage  using  ORT)  may  be  cost-effectively  assessed  by  outcome  data  (e.g., 
number of cases of malaria, cases of measles, deaths from diarrhea). There is a
relationship between the proportion of persons with a disease who have "received"
an intervention, the effectiveness of the intervention, and the "coverage" of the
intervention in the population. The relationship is as follows:

\[
\mathrm{PCI} = \frac{\mathrm{PPI} - (\mathrm{PPI} \times \mathrm{Eff})}{1 - (\mathrm{PPI} \times \mathrm{Eff})}
\]

where PCI is the percentage of the cases of disease exposed to the intervention,
PPI is the percentage of the population exposed to the intervention, and Eff is
the efficacy of the intervention. In the formula, each quantity is entered as a
proportion between 0 and 1 (e.g., 95% is entered as 0.95).

This formula is derived from the formula for program (vaccine) efficacy, where
efficacy equals the attack rate among persons not exposed to the program or
intervention minus the attack rate among persons exposed, divided by the attack
rate among those unexposed (26). For vaccine efficacy, Eff = VE (vaccine
efficacy), PCI = PCV (percentage of case-patients who are vaccinated), and PPI =
PPV (percentage of the population vaccinated).

The graphic representation of this formula is known in immunization programs as
the vaccine-efficacy curve (Figure XIII.A.1) (26). As an example, if the
percentage of case-patients with disease who have been exposed to the
intervention (PCI) is <20%, the coverage of the intervention in the population
(PPI) is poor (provided that the efficacy of the intervention is 90% or less). If
the proportion of case-patients who have received the intervention is >50%, either
the percentage coverage is high or the efficacy of the intervention is low. To
estimate from surveillance the coverage of cases, one needs to determine whether
persons with the disease were or were not exposed to a particular intervention
(e.g., whether case-patients used condoms, whether case-patients received
appropriate home fluids, or whether case-patients received vaccine).
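
The relationship among PCI, PPI, and Eff can be worked through numerically. The
following is a minimal Python sketch, not part of the original appendix; the
function names are hypothetical, all quantities are entered as proportions, and
the efficacy of 0.90 is an assumed illustrative value. The second function
simply rearranges the same formula to solve for PPI.

```python
def pct_cases_exposed(ppi: float, eff: float) -> float:
    """PCI implied by population coverage (PPI) and efficacy (Eff).

    All values are proportions between 0 and 1.
    Undefined when ppi * eff == 1 (full coverage with perfect efficacy).
    """
    return (ppi - ppi * eff) / (1.0 - ppi * eff)


def implied_coverage(pci: float, eff: float) -> float:
    """Rearrangement of the same formula: PPI implied by PCI and Eff."""
    return pci / (1.0 - eff + pci * eff)


ASSUMED_EFF = 0.90  # illustrative assumption, not a measured value

# If coverage really were 95%, about 66% of case-patients would be exposed.
expected_pci = pct_cases_exposed(0.95, ASSUMED_EFF)

# If only 20% of case-patients were exposed, coverage would be about 71%.
coverage_from_cases = implied_coverage(0.20, ASSUMED_EFF)

print(f"expected PCI at 95% coverage: {expected_pci:.1%}")
print(f"coverage implied by PCI of 20%: {coverage_from_cases:.1%}")
```

Under these assumptions, a reported coverage of 95% alongside a PCI of 20% would
be inconsistent, which is the kind of discrepancy the method is meant to flag.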

To use the formula or the curve, the exposure to the intervention must be
dichotomized into a "yes/no" format. For example, for poliomyelitis, exposure is
categorized into "fully vaccinated" with ≥3 doses of vaccine and "not fully
vaccinated" with <3 doses of vaccine. This method has several advantages. It
allows estimates of coverage at the health-station level, which allows local
action to solve local health problems. It is much simpler and cheaper than
conducting surveys, it provides information about effectiveness as well as
coverage, and it is more difficult to falsify than coverage-survey and
administrative method estimates. However, this method provides only a crude
estimate and should be used with other sources of data. For example, if the
survey or administrative estimate of OPV3 coverage is 95%, and only 20% of
confirmed poliomyelitis case-patients received 3 doses of OPV, then the survey or
administrative estimates should be questioned.

Appendix XIII.B. 30-Cluster EPI Survey Design

In the absence of an internationally funded survey to which modules or questions
desired by a ministry of health can be attached, a 30-cluster EPI survey can be
performed (20). The EPI survey was designed to provide a crude estimate of
vaccination coverage (±10%) (27); it provided information about whether
vaccination coverage was low (20%-40%) or relatively high (70%-90%). Other
programs have adapted the design for other purposes (e.g., mortality from neonatal
tetanus, mortality and practices associated with diarrhea, and changes in
vaccination coverage over time) (22,23).

However, results have often been misleading because appropriate confidence
intervals were not calculated. Many health professionals did not realize that the
confidence interval for each survey was not fixed at ±10% but varied depending on
the results (inter-cluster correlation and the point estimate) of each survey.
Often confidence intervals were not calculated and appropriate analyses of
subgroups (males, females) were not done because easy-to-use computer programs
were not available. Fortunately, such computer programs as COSAS, a Lotus
spreadsheet for diarrhea cluster surveys, and CLUSTER (which runs within Epi Info)
are now available to calculate appropriate confidence intervals. However, if an
analysis by age, by gender, or some other specific characteristic is desired, a
more complicated program (e.g., SUDAAN or CARP) still must be used to obtain valid
point estimates and valid confidence intervals (28). For example, one cannot get
a valid estimate of coverage for males and females in a typical EPI coverage
survey without the use of SUDAAN.
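
To illustrate why the confidence interval is not fixed at ±10%, the following
Python sketch computes a coverage estimate and an approximate 95% confidence
interval from cluster-level counts. It is an illustration only and is not the
algorithm used by COSAS, CLUSTER, SUDAAN, or CARP: it assumes a standard
ratio-estimator variance based on between-cluster variation and a normal
multiplier of 1.96, and the function name and the data are hypothetical.

```python
import math


def cluster_proportion_ci(vaccinated, sampled, z=1.96):
    """Coverage estimate and approximate CI from per-cluster counts.

    vaccinated[i] = children vaccinated in cluster i
    sampled[i]    = children surveyed in cluster i
    Variance uses the ratio-estimator (between-cluster) formula.
    """
    k = len(sampled)
    if k < 2 or k != len(vaccinated):
        raise ValueError("need matching counts for at least two clusters")
    total = sum(sampled)
    p = sum(vaccinated) / total
    # Sum of squared cluster-level deviations from the overall proportion.
    ss = sum((y - p * m) ** 2 for y, m in zip(vaccinated, sampled))
    se = math.sqrt((k / (k - 1)) * ss / total ** 2)
    return p, max(0.0, p - z * se), min(1.0, p + z * se)


# Two hypothetical 30-cluster surveys (7 children each), both with 50% coverage.
sizes = [7] * 30
heterogeneous = [7] * 15 + [0] * 15   # clusters either fully or not vaccinated
homogeneous = [4, 3] * 15             # every cluster close to 50%

for label, counts in (("heterogeneous", heterogeneous), ("homogeneous", homogeneous)):
    p, low, high = cluster_proportion_ci(counts, sizes)
    print(f"{label}: {p:.0%} (95% CI {low:.1%} to {high:.1%})")
```

Both hypothetical surveys give the same point estimate, but the interval is much
wider when coverage differs sharply between clusters.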

As the use of the cluster survey becomes more sophisticated and as greater
accuracy and precision are desired, use of the EPI cluster-survey design is
complicated by the potential for bias in both selection of the first house and
subsequent selection of additional houses (29). Despite being designed and
analyzed as a survey with equal probability of selection, selection of the
starting house from a randomly selected direction yields a higher probability of
selection for houses near the middle of the cluster. If occupants near the middle
of the cluster have some characteristic associated with the outcome (e.g., have
higher incomes), a biased estimate will result.

An  alternative  method  of  selecting  the  first  and  additional  houses  in  a  cluster  is 
by  segmenting  and  subsegmenting  the  cluster  until  a  small  number  of  houses  can  be 
mapped  (e.g.,  30  houses).   Then,  the  first  and  additional  houses  can  be  chosen  at 
random. If one assumes that the number of target-group persons per household is
similar in all clusters, valid point estimates and approximate confidence
intervals can be calculated using less-complicated programs (CLUSTER and COSAS).
The  use  of  subsegmenting  in  the  absence  of  being  able  to  select  the  first  house 
randomly  has  also  been  described. 

An  easy-to-use  program  that  appropriately  analyzes  cluster  surveys  (including 
appropriate  analysis  of  subgroups  and  comparison  of  two  independent  surveys  done 
at two different times) operating within Epi Info is being prepared.

Chapter  XIII: 

Figure XIII.A.1. Percentage of case-patients vaccinated (PCV) per percentage of
population vaccinated (PPV) for seven values of vaccine efficacy (VE)


TABLE I.1. The uses of surveillance (23)

Quantitative  estimates  of  the  magnitude  of  a  health  problem. 

Portrayal  of  the  natural  history  of  disease. 

Detection  of  epidemics. 

Documentation  of  the  distribution  and  spread  of  a  health  event. 

Facilitating  epidemiologic  and  laboratory  research. 

Testing  of  hypotheses. 

Evaluation  of  control  and  prevention  measures. 

Monitoring  of  changes  in  infectious  agents. 

Monitoring  of  isolation  activities. 

Detection  of  changes  in  health  practice. 

and  planning 


TABLE II.1. Steps in planning a surveillance system

1.  Establish  objectives. 

2.  Develop  case  definitions. 

3.  Determine  data  source  or  data-collection  mechanism  (type  of  system) 

4.  Develop  data-collection  instruments. 

5.  Field  test  methods. 

6.  Develop  and  test  analytic  approach. 

7.  Develop  dissemination  mechanism. 

8.  Assure  use  of  analysis  and  interpretation. 


TABLE II.2. Criteria for identifying high-priority health events for surveillance

• Frequency:
    Incidence
    Prevalence
    Mortality
    Years of potential life lost

• Severity:
    Case-fatality ratio
    Hospitalization rate
    Disability

• Cost:
    Direct and indirect costs

• Preventability

• Communicability

• Public interest


TABLE IV.1. Essential questions for the practice of effective disease/injury reporting

Initiation/sources of reports

*  How  and  by  whom  are  health-care  practitioners  (existing  and  newly  practicing) 
entered  into  the  reporting  network? 

*  By  what  agency  are  conditions  reported  for  such  temporary  residents  as  college 
students,  military  personnel,  and  migrant  workers? 

Routing/timing  of  reports 

*  How should "suspected case, laboratory results pending" be handled?

*  Should  the  local  or  the  state  health  department  update  a  case  report  when 
additional  information  is  received? 

*  Should  case  reports  arise  from  the  health  jurisdiction  in  which  the  patient 
resides?  In  which  the  patient  became  infected  (injured)?  In  which  the  patient 
became  ill  (and/or  received  treatment)? 

*  Should  a  diagnostic  laboratory  send  data  on  reportable  conditions  to  the  requester, 
or  should  it  be  responsible  for  reporting  to  appropriate  local/state  health 
departments?   (If  "yes"  to  the  latter,  in  what  order?) 

*  If  a  case  occurs  one  calendar  year,  but  is  not  reported  until  early  in  the  next 
calendar  year,  what  is  the  year  of  report?  What  is  the  cut-off  date  for  reports 
from  the  previous  year?  How  are  reports  treated  that  are  for  the  previous  year  but 
are  received  after  the  established  deadline? 

*  Is there a mechanism for reporting disease/injury across state lines, as
appropriate?

Policy issues in reporting disease/injury

*  What items on the reporting form must be completed before a report can be
forwarded?

*  If a reportable condition has a specific case definition (such as measles and
AIDS), should the case be reported before confirmation by a disease investigator?
(3)

*  What mechanism will be (has been) established to deal with situations in which
cases must be reported in batches rather than individually because the number of
reports is overwhelmingly large?

*  If case reports are held pending laboratory confirmation, should the "date of
report" reflect the original date of report or the date laboratory confirmation was
received or some other date associated with this health event?

*  Are  reports  generated  to  identify  records  with  incomplete/unconfirmed  data  so  that 
follow-up  can  be  initiated? 

*  How  does  one  avoid  duplicate  reports  of  the  same  case? 

*  How are discrepancies in the information on duplicate reports resolved?


TABLE IV.2. Concerns of the data-base manager

1.  Who will enter the data? What credentials must this person have? Who is this
person's back-up? Who will update records? Who will back up the computer file?

2.  Will  data  be  entered  on  an  as-received  basis  or  according  to  an  established 
schedule? 

3.  Does  the  data-entry  screen  replicate  the  paper  form  from  which  data  are  to  be 
entered? 

4.  Does the data-entry program allow for certain data items to be entered
automatically on subsequent screens until the data recorder makes a change? (For
example, the county initially entered will appear on each subsequent screen until
the recorder types in a different county. This allows the recorder to batch
records for more efficient entry.)

5.  Does the data-entry program effectively validate the data being entered for
completeness by use of "must-enter" fields and "look-up" files?

6.  Does the data-entry program have the ability to do range checking on values
entered? If so, does the system allow for acceptable ranges to change, reflecting
values entered in the data base over time? Is there a logic audit procedure in
the system to locate such errors as misspelled names or addresses, incorrectly
coded race, gender, or code for disease/injury?

7.  At  what  level  (state  or  local)  will  records  be  changed  or  deleted?  Who  owns  the 
data  records? 

8.  If  the  data  base  is  distributed  to  other  users  as  an  electronic  file  or  on  floppy 
diskette,  are  there  safeguards  to  prevent  overwriting  another  user's  data? 
Safeguards  against  computer  viruses? 

9.  Are  the  data-entry  programs  flexible  enough  to  allow  variables  to  be  modified  as 
prescribed  by  changes  in  state  regulations  and  national  recommendations? 

10.  Are  production  reports  automatically  generated  for  quality  assurance  of  data  entry? 

11.  How  and  with  what  frequency  are  data  copied  and  stored  for  back-up  purposes?  Are 
paper/film  copies  maintained  (in  the  event  of  computer  failure)? 

12.  Are  double-entry  systems  used  for  quality  assurance? 
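
Items 5 and 6 above (must-enter fields, look-up files, and range checking) can be
sketched in a few lines of Python. This is a hypothetical illustration rather
than the behavior of any particular data-entry program; the field names, the
look-up values, and the acceptable age range are all assumptions made for the
example.

```python
# Hypothetical configuration; a real system would load these from its own files.
MUST_ENTER = ("disease_code", "county", "report_date")
LOOKUP = {"county": {"Dade", "Pinellas"}}   # stand-in for a "look-up" file
RANGES = {"age_years": (0, 120)}            # assumed acceptable range


def validate(record: dict) -> list[str]:
    """Return a list of problems found in one case report (empty if none)."""
    problems = []
    # Completeness: every must-enter field needs a non-empty value.
    for field in MUST_ENTER:
        if record.get(field) in (None, ""):
            problems.append(f"{field}: must-enter field is empty")
    # Look-up check: entered values must appear in the look-up list.
    for field, allowed in LOOKUP.items():
        value = record.get(field)
        if value not in (None, "") and value not in allowed:
            problems.append(f"{field}: {value!r} not found in look-up file")
    # Range check: numeric values must fall inside the acceptable range.
    for field, (low, high) in RANGES.items():
        value = record.get(field)
        if value is not None and not (low <= value <= high):
            problems.append(f"{field}: {value} outside {low}-{high}")
    return problems


report = {"disease_code": "", "county": "Dede",
          "report_date": "1980-06-01", "age_years": 150}
for problem in validate(report):
    print(problem)
```

Keeping the rules in plain data structures is one way to let acceptable values
and ranges change without rewriting the entry program, as items 6 and 9 suggest.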


TABLE V.1. Rates and quantities involving rates commonly used in epidemiology

Measure | Numerator | Denominator | Expressed per number at risk

Measures of morbidity:

Incidence rate | Number of new cases of specified condition/given time | Population at start of time interval | Variable: 10^x where x = 2,3,4,5,6

Attack rate | Number of new cases of specified condition/epidemic period | Population at start of epidemic period | Variable: 10^x where x = 2,3,4,5,6

Secondary attack rate | Number of new cases of specified condition among contacts of known patients | Size of contact population at risk | Variable: 10^x where x = 2,3,4,5,6

Point prevalence | Number of current cases of specified condition at given time | Estimated population at same point in time | Variable: 10^x where x = 2,3,4,5,6

Period prevalence | Number of old cases plus new cases of specified condition identified in given time interval | Estimated mid-interval population | Variable: 10^x where x = 2,3,4,5,6

Measures of mortality:

Crude death rate | Total number of deaths reported in given time interval | Estimated mid-interval population | 1,000 or 100,000

Cause-specific death rate | Number of deaths from specific cause in given time interval | Estimated mid-interval population | 100,000

Proportionate mortality | Number of deaths from specific cause in given time interval | Total number of deaths from all causes in same interval | 100 or 1,000

Death-to-case ratio (case-fatality rate, case-fatality ratio) | Number of deaths from specific condition in given time interval | Number of new cases of that condition in same time interval | 100

Neonatal mortality rate | Number of deaths (<28 days of age) in given time interval | Number of live births in same time interval | 1,000

Infant mortality rate | Number of deaths (<1 year of age) in given time interval | Number of live births reported in same time interval | 1,000

Maternal mortality rate | Number of deaths from pregnancy-related causes in given time interval | Number of live births in same time interval | 100,000

Measures of natality:

Crude birth rate | Number of live births reported in given time interval | Estimated total mid-interval population | 1,000

Crude fertility rate | Number of live births reported in given time interval | Estimated number of women ages 15-44 years at mid-interval | 1,000

Crude rate of natural increase | Number of live births minus number of deaths in given time interval | Estimated total mid-interval population | 1,000

Low birth weight ratio | Number of live births (<2,500 grams) in given time interval | Number of live births reported in same time interval | 100


TABLE V.2. Crude death rates-Dade and Pinellas counties, Florida, 1980

County | Population | Deaths | Crude death rate (per 1,000 population)
Dade County | 1,706,097 | 16,859 | 9.9
Pinellas County | 732,685 | 11,531 | 15.7

Sources: Bureau of the Census, 1983.
National Center for Health Statistics, Centers for Disease Control.


TABLE V.3. Age-specific death rates-Dade and Pinellas counties, Florida, 1980

Age group (years) | Dade County: Population | Deaths | Rate (per 1,000 pop.) | Pinellas County: Population | Deaths | Rate (per 1,000 pop.)
0-4 | 97,870 | 383 | 3.9 | 31,005 | 101 | 3.3
5-14 | 221,452 | 75 | 0.3 | 77,991 | 20 | 0.3
15-24 | 284,956 | 440 | 1.5 | 95,456 | 80 | 0.8
25-34 | 265,885 | 529 | 2.0 | 90,435 | 129 | 1.4
35-44 | 207,564 | 538 | 2.6 | 65,519 | 168 | 2.6
45-54 | 193,505 | 1,107 | 5.7 | 69,572 | 460 | 6.6
55-64 | 175,579 | 2,164 | 12.3 | 98,132 | 1,198 | 12.2
65-74 | 152,172 | 3,789 | 24.9 | 114,686 | 2,746 | 23.9
≥75 | 107,114 | 7,834* | 73.1 | 89,889 | 6,629* | 73.7
Total | 1,706,097 | 16,859 | 9.9 | 732,685 | 11,531 | 15.7

Sources: Bureau of the Census, 1983.
National Center for Health Statistics, Centers for Disease Control.

*Deaths ≥75 include six persons of unknown age for Dade and one of unknown age for Pinellas counties.

TABLE V.4. Directly standardized death rates-Dade and Pinellas counties, Florida, 1980*

Column (A): 1980 U.S. population (percentage distribution)
Column (B): Age-specific death rates (per 1,000 pop.)
Column (C): Expected deaths in 1980 U.S. population using county age-specific rates†

Age group (years) | (A) | (B) Dade County | (B) Pinellas County | (C) Dade County | (C) Pinellas County
0-4 | 7.2 | 3.9 | 3.3 | 28 | 24
5-14 | 15.3 | 0.3 | 0.3 | 5 | 5
15-24 | 18.7 | 1.5 | 0.8 | 28 | 15
25-34 | 16.5 | 2.0 | 1.4 | 33 | 23
35-44 | 11.4 | 2.6 | 2.6 | 30 | 30
45-54 | 10.0 | 5.7 | 6.6 | 57 | 66
55-64 | 9.6 | 12.3 | 12.2 | 118 | 117
65-74 | 6.9 | 24.9 | 23.9 | 172 | 165
≥75 | 4.4 | 73.1 | 73.7 | 322 | 324
Totals | 100.0 | 9.9 | 15.7 | 793 | 769

Directly adjusted death rates (per 1,000 pop.)§: Dade County 7.9; Pinellas County 7.7

*United States population, 1980, used as standard.

†C_ij = A_i x B_ij where i = 1,...,9 age groups and j = 1,2 counties.

§Σ_i C_ij / 100.
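
The calculation in Table V.4 can be reproduced with a short Python sketch. The
figures are those shown in the table; the function and variable names are
hypothetical. Because the sketch multiplies unrounded products rather than the
rounded whole numbers in column (C), intermediate sums differ slightly from the
table, although the adjusted rates agree after rounding.

```python
# Column (A): 1980 U.S. population, percentage distribution by age group.
US_1980_PCT = [7.2, 15.3, 18.7, 16.5, 11.4, 10.0, 9.6, 6.9, 4.4]

# Column (B): county age-specific death rates per 1,000 population.
DADE_RATES = [3.9, 0.3, 1.5, 2.0, 2.6, 5.7, 12.3, 24.9, 73.1]
PINELLAS_RATES = [3.3, 0.3, 0.8, 1.4, 2.6, 6.6, 12.2, 23.9, 73.7]


def directly_adjusted_rate(standard_pct, county_rates):
    """Sum of A_i x B_ij over age groups, divided by 100 (footnotes † and §)."""
    expected = [a * b for a, b in zip(standard_pct, county_rates)]  # column (C)
    return sum(expected) / 100.0


print(round(directly_adjusted_rate(US_1980_PCT, DADE_RATES), 1))
print(round(directly_adjusted_rate(US_1980_PCT, PINELLAS_RATES), 1))
```

Once both counties are weighted by the same standard age distribution, the large
difference between the crude rates (9.9 versus 15.7) nearly disappears.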


TABLE V.5. Indirectly standardized death rates-Dade and Pinellas counties, Florida, 1980*

Column (A): Death rates (per 1,000 pop.), U.S. 1980
Column (B): 1980 population
Column (C): Expected number of deaths in county based on U.S.-specific rates†

Age group (years) | (A) | (B) Dade | (B) Pinellas | (C) Dade | (C) Pinellas
0-4 | 3.3 | 97,870 | 31,005 | 323 | 102
5-14 | 0.3 | 221,452 | 77,991 | 66 | 23
15-24 | 1.2 | 284,956 | 95,456 | 342 | 115
25-34 | 1.3 | 265,885 | 90,435 | 346 | 118
35-44 | 2.3 | 207,564 | 65,519 | 477 | 151
45-54 | 5.9 | 193,505 | 69,572 | 1,142 | 410
55-64 | 13.4 | 175,579 | 98,132 | 2,353 | 1,315
65-74 | 29.8 | 152,172 | 114,686 | 4,535 | 3,418
≥75 | 87.2§ | 107,114 | 89,889 | 9,340 | 7,838
Totals | 8.8 | 1,706,097 | 732,685 | 18,924 | 13,490

Summary measure | Dade | Pinellas
Expected death rates (per 1,000 pop.)¶ | 11.1 | 18.4
Adjusting factors** | 0.79 | 0.48
Crude death rates (per 1,000 pop.) | 9.9 | 15.7
Indirectly adjusted death rates (per 1,000 pop.)†† | 7.8 | 7.5

*United States age-specific death rates, 1980, used as standard.

†C_ij = A_i x B_ij where i = 1,...,9 age groups and j = 1,2 counties.

§Deaths ≥75 include [number illegible] of unknown age for United States.

¶Σ_i C_ij / Σ_i B_ij for j = 1,2.

**U.S. total death rate/expected death rate.

††Crude death rate x adjusting factor.
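
The steps in Table V.5 can be sketched in Python as follows. The rates and
populations are those in the table, and the observed deaths (16,859 and 11,531)
are taken from Table V.3; the function and variable names are hypothetical. The
sketch carries unrounded intermediate values, whereas the table rounds the
adjusting factor to two decimal places before multiplying, so small differences
in the second decimal place are to be expected.

```python
# Column (A): U.S. 1980 age-specific death rates per 1,000 population.
US_RATES = [3.3, 0.3, 1.2, 1.3, 2.3, 5.9, 13.4, 29.8, 87.2]
US_TOTAL_RATE = 8.8

# Column (B): county populations by age group; observed deaths from Table V.3.
DADE_POP = [97870, 221452, 284956, 265885, 207564, 193505, 175579, 152172, 107114]
PINELLAS_POP = [31005, 77991, 95456, 90435, 65519, 69572, 98132, 114686, 89889]
DADE_DEATHS, PINELLAS_DEATHS = 16859, 11531


def indirect_adjustment(std_rates, populations, observed_deaths, std_total_rate):
    """Return (expected deaths, adjusting factor, indirectly adjusted rate)."""
    total_pop = sum(populations)
    # Column (C): deaths expected if the county had the standard rates.
    expected = sum(r * n for r, n in zip(std_rates, populations)) / 1000.0
    expected_rate = expected / total_pop * 1000.0
    factor = std_total_rate / expected_rate          # footnote **
    crude_rate = observed_deaths / total_pop * 1000.0
    return expected, factor, crude_rate * factor     # footnote ††


for name, pop, deaths in (("Dade", DADE_POP, DADE_DEATHS),
                          ("Pinellas", PINELLAS_POP, PINELLAS_DEATHS)):
    expected, factor, adjusted = indirect_adjustment(US_RATES, pop, deaths, US_TOTAL_RATE)
    print(f"{name}: expected {expected:,.0f}, factor {factor:.2f}, adjusted {adjusted:.1f}")
```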


TABLE  V.6.   Five-number  summary  of  39  4-week  totals  of  reported  cases  of  meningococcal  infections- 
United  States,  1987-1989 


Median  190 

Hinges  151  237 

Extremes  102  350 


TABLE V.7. Common power transformations (y → y^p)

p | Transformation | Name | Notes

(Higher powers)
2 | y^2 | Square |
1 | y | Raw | No transformation
1/2 | √y | Square root | Appropriate for count data
0 | log(y) | Logarithm | Generally logarithm to base 10, widely used
-1/2 | -1/√y | Reciprocal root | Minus sign preserves order
-1 | -1/y | Reciprocal | Minus sign preserves order
-2 | -1/y^2 | Reciprocal square | Minus sign preserves order
(Lower powers)
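
The ladder of powers in Table V.7 can be expressed as a single function. The
sketch below is illustrative; the function name is hypothetical, and it assumes
positive values of y, because the logarithm and the reciprocals are undefined at
zero. The sample values are the extremes and median from Table V.6.

```python
import math


def power_transform(y: float, p: float) -> float:
    """Apply the Table V.7 transformation for power p to a positive value y."""
    if y <= 0:
        raise ValueError("this sketch assumes y > 0")
    if p == 0:
        return math.log10(y)      # logarithm takes the place of y**0
    if p > 0:
        return y ** p
    return -(y ** p)              # minus sign preserves the order of the data


counts = [102, 190, 350]          # extremes and median from Table V.6
for p in (2, 1, 0.5, 0, -0.5, -1, -2):
    transformed = [power_transform(y, p) for y in counts]
    # Order is preserved for every power on the ladder.
    assert transformed == sorted(transformed)
    print(p, [round(t, 4) for t in transformed])
```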


TABLE V.8. Guide for selecting data graphics

Type of graph or chart | When to use

Arithmetic-scale line graph | Trends in numbers or rates over time

Semilogarithmic-scale line graph | 1. Emphasize rate of change over time; 2. Display values ranging >2 orders of magnitude

Histogram | 1. Frequency distribution of continuous variable; 2. Number of cases during epidemic (i.e., epidemic curve) or over time

Frequency polygon | Frequency distribution of continuous variable, especially to show components

Cumulative frequency | Cumulative frequency

Scatter diagram | Plot association between two variables

Simple bar chart | Compare size or frequency of different categories of single variable

Grouped bar chart | Compare size or frequency of different categories of 2-4 series of data

Stacked bar chart | Compare totals and illustrate component parts of the total among different groups

Deviation bar chart | Illustrate differences, both positive and negative, from baseline

Pie chart | Show components of a whole

Spot map | Show location of cases or events

Choropleth map | Display events or rates geographically

Box plot | Visualize statistical characteristics (e.g., median, range, skewness) of variable


TABLE V.9. Primary and secondary morbidity from syphilis, by age category-United States, 1989

Age group (years) | Cases: Number | Percentage*
≤14 | 230 | 0.5
15-19 | 4,378 | 10.0
20-24 | 10,405 | 23.6
25-29 | 9,610 | 21.8
30-34 | 8,648 | 19.6
35-44 | 6,901 | 15.7
45-54 | 2,631 | 6.0
≥55 | 1,278 | 2.9
Total | 44,081 | 100.0

*Percentages do not add to 100.0 due to rounding.


TABLE VII.1. Controlling and directing information dissemination

Steps | Questions to be Answered
Establish communications message | What should be said?
Define audience | To whom should it be said?
Select the channel | Through what communication medium?
Market the message | How should the message be stated?
Evaluate the impact | What effect did the message create?


TABLE VIII.1. Sample case definition developed by the Centers for Disease Control and
the U.S. Council of State and Territorial Epidemiologists

Measles 

Clinical  case  definition 

An  illness  characterized  by  all  of  the  following  clinical  features: 

•  A generalized rash lasting ≥3 days

•  A temperature ≥38.3 C (101 F)

•  Cough  or  coryza  or  conjunctivitis 

Laboratory  criteria  for  diagnosis 

•  Isolation  of  measles  virus  from  a  clinical  specimen 

or 

•  Significant  rise  in  measles  antibody  level  by  any  standard  serologic  assay 

or 

•  Positive  serologic  test  for  IgM  antibody  (to  measles) 

Case  classification 

Suspected: any rash illness with fever.

Probable: meets the clinical case definition, has no or noncontributory serologic
or virologic testing, and is not epidemiologically linked to a probable or
confirmed case.

Confirmed: a case that is laboratory confirmed or that meets the clinical case
definition and is epidemiologically linked to a confirmed or probable case. A
laboratory-confirmed case does not need to meet the clinical case definition.

Comment 

Two  probable  cases  that  are  epidemiologically  linked  would  be  considered  confirmed, 
even  in  the  absence  of  laboratory  confirmation. 
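
The classification rules in this sample case definition can be written as a small
decision procedure. The Python sketch below is an illustration only: the record
layout and names are hypothetical, the table does not give a temperature
threshold for "fever" in a suspected case (38.0 C is assumed here), and a report
that meets the clinical definition but has a negative, contributory laboratory
result is treated as suspected, which is an assumption the table does not
address.

```python
from dataclasses import dataclass

FEVER_C = 38.0  # assumed threshold for "fever"; not specified in the table


@dataclass
class Report:
    rash_days: int                # days of generalized rash (0 = no rash)
    temp_c: float                 # highest recorded temperature, degrees C
    cough: bool = False
    coryza: bool = False
    conjunctivitis: bool = False
    lab: str = "none"             # "positive", "negative", or "none"
    epi_linked: bool = False      # linked to a confirmed or probable case


def meets_clinical_definition(r: Report) -> bool:
    """All three clinical features listed in the definition must be present."""
    return (r.rash_days >= 3 and r.temp_c >= 38.3
            and (r.cough or r.coryza or r.conjunctivitis))


def classify(r: Report) -> str:
    if r.lab == "positive":
        return "confirmed"        # laboratory confirmation alone is sufficient
    clinical = meets_clinical_definition(r)
    if clinical and r.epi_linked:
        return "confirmed"
    if clinical and r.lab == "none":
        return "probable"         # no or noncontributory testing, not linked
    if r.rash_days > 0 and r.temp_c >= FEVER_C:
        return "suspected"        # any rash illness with fever
    return "not a case"


print(classify(Report(rash_days=4, temp_c=39.0, cough=True)))
print(classify(Report(rash_days=4, temp_c=39.0, cough=True, epi_linked=True)))
print(classify(Report(rash_days=1, temp_c=38.5)))
```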


TABLE VIII.2. The detection of health conditions with a surveillance system*

                                   "Condition" present
                              Yes                   No
Detected by       Yes   True positive (A)     False positive (B)    A+B
surveillance      No    False negative (C)    True negative (D)     C+D
                              A+C                   B+D             TOTAL

*Sensitivity = A/(A+C).
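
Using the cell labels of Table VIII.2, sensitivity and predictive value positive
can be computed as in the following Python sketch. The counts are hypothetical,
and the expression for predictive value positive, A/(A+B), is the standard
definition rather than one stated in the table itself.

```python
def sensitivity(a: int, c: int) -> float:
    """Proportion of true cases detected by surveillance: A / (A + C)."""
    return a / (a + c)


def predictive_value_positive(a: int, b: int) -> float:
    """Proportion of detected reports that are true cases: A / (A + B)."""
    return a / (a + b)


# Hypothetical counts: A = 80 true positives, B = 40 false positives,
# C = 20 false negatives.
A, B, C = 80, 40, 20
print(f"sensitivity: {sensitivity(A, C):.0%}")
print(f"predictive value positive: {predictive_value_positive(A, B):.0%}")
```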


TABLE VIII.3. Comparison of estimated costs for active and passive surveillance systems
in a health department, Vermont, June 1, 1980, to May 31, 1981

Type of surveillance system

Item | Active* | Passive†
Paper | $114 | $80
Mailing | 185 | 48
Telephone | 1,947 | 175
Personnel: Secretary | 3,000 | 2,000
Personnel: Public health nurses | 14,025 | 0
TOTAL | $19,271 | $2,203

*Active = Weekly calls from health department to request reports.
†Passive = Provider-initiated reporting.


TABLE VIII.4. Outline of sample surveillance evaluation report

1. Public Health Importance

Describe the public health importance of the health event. The three most
important categories to consider are the following:

•  Total number of cases, incidence, and prevalence.

•  Indices of severity such as the mortality rate and the case-fatality ratio.

•  Preventability.

2. Objectives and Usefulness

Explicitly state the objectives of the system and the health event(s) being
monitored (case definitions). Describe the actions that have been taken as a
result of the data from the surveillance system. Describe who has used the data
to make decisions and take actions. List other anticipated uses of the data.

3. System Operation

Describe the following: the population under surveillance, the period of time of
the data collection, the information that is collected, who provides the
information, how the information is transferred and how often, how the data are
analyzed (by whom and how often), how often reports are disseminated, and how
reports are distributed (to whom and in what media). Include an assessment of
the simplicity, flexibility, and acceptability of the system.

4. Quantitative Attributes: Include assessments of the sensitivity, predictive
value positive, representativeness, and timeliness of the system.

5. Cost of Operating the Surveillance System. Estimate direct costs and, if
possible, assess cost-benefit issues.

6. Conclusions and Recommendations

These should state whether the system is meeting its objectives and should
address the issue of whether to continue and/or modify the surveillance system.

TABLE IX.2. An ethical checklist for public health surveillance

1.  Justify  the  surveillance  system  in  terms  of  maximizing  potential  public  health 
benefits  and  minimizing  public  harm. 

2.  Justify  use  of  identifiers  and  the  maintenance  of  records  with  identifiers. 

3.  Have  surveillance  protocols  and  analytic  research  reviewed  by  colleagues,  and 
share  data  and  findings  with  colleagues  and  the  public  health  community  at  large. 

4.  Elicit  informed  consent  from  potential  surveillance  subjects. 

5.  Assure  the  protection  of  confidentiality  of  subjects. 

6.  Inform  health-care  providers  of  conditions  germane  to  their  patients. 

7.  Inform  the  public,  the  public  health  community,  and  clinicians  of  findings  of 
surveillance. 


TABLE XII.1. Reasons cited by physicians for failure to report notifiable diseases
(42,45-47)

1.  Assumed  that  the  case  would  be  reported  by  someone  else. 

2.  Unaware  that  disease  reporting  was  required. 

3.  Do  not  have  notifiable  disease  reporting  form/telephone  number. 

4.  Do  not  know  how  to  report  notifiable  diseases. 

5.  Do  not  have  copy  of  list  of  notifiable  diseases. 

6.  Concerned  about  confidentiality. 

7.  Concerned  about  violation  of  doctor-patient  relationship. 

8.  Reporting  is  too  time-consuming. 

9.  Absence  of  incentives  to  report. 


TABLE XII.2. What local and state health departments can do to improve reporting by
physicians

Local health departments

•  Express an interest in disease reporting to those responsible for reporting.

•  Maximize contact with the local medical community.

   - Presentations
   - Mailings
   - Newsletters
   - Telephone contact
   - Mass media

•  Use the data.

State health departments

•  Express an interest in disease reporting to those responsible for reporting.

•  Maintain a reasonable list of reportable conditions.

•  Maximize contact with the state medical community.

   - Presentations
   - Mailings
   - Newsletters
   - Telephone contact
   - Mass media

•  Use the data.


TABLE XII.3. Criteria used to set priorities for national disease surveillance,
Canada (60)

1.  Surveillance by the World Health Organization

2.  Importance to agriculture in Canada

3.  Disease incidence

4.  Morbidity (hospital days and short-term disability)

5.  Mortality

6.  Case-fatality ratio

7.  Communicability

8.  Potential for outbreaks

9.  Socioeconomic impact

10. Public perception of risk

11. Vaccine preventability

12. Necessity for an immediate public health response


TABLE XII.4. Confidence intervals for rates (61)

Let    r = rate per 1,000

       n = denominator upon which rate is based

The limits of the 95-percent confidence interval are:

upper limit: $r + 61.981\sqrt{r/n}$

lower limit: $r - 61.981\sqrt{r/n}$
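
The following Python sketch applies the limits r ± 61.981·sqrt(r/n) from the table. The function name and the example figures are hypothetical. The constant 61.981 equals 1.96 × sqrt(1,000), which is what results if the number of events is treated as approximately Poisson; that reading is an interpretation, not something the table states.

```python
import math


def rate_ci_per_1000(r: float, n: float) -> tuple[float, float]:
    """95% confidence limits for a rate r (per 1,000) based on denominator n."""
    if n <= 0 or r < 0:
        raise ValueError("n must be positive and r must be non-negative")
    half_width = 61.981 * math.sqrt(r / n)
    return r - half_width, r + half_width


# Hypothetical example: 50 events among 10,000 persons, so r = 5.0 per 1,000.
lower, upper = rate_ci_per_1000(5.0, 10_000)
print(f"95% CI: {lower:.2f} to {upper:.2f} per 1,000")
```

With these illustrative inputs the interval runs from roughly 3.61 to 6.39 per 1,000. For very small event counts the lower limit can fall below zero, a sign that this approximation is no longer adequate.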


TABLE XII.5. Formula for calculating the 95% confidence interval for the ratio of two
independent rates (61)

Let    r1 = rate for period 1 (or area 1)

       d1 = number of events for period 1 (or area 1)

       r2 = rate for period 2 (or area 2)

       d2 = number of events for period 2 (or area 2)

       R = r1/r2

The limits of the 95% confidence interval are:

upper limit: $R + 1.96R\sqrt{1/d_1 + 1/d_2}$

lower limit: $R - 1.96R\sqrt{1/d_1 + 1/d_2}$
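
A minimal Python sketch of the limits R ± 1.96·R·sqrt(1/d1 + 1/d2) follows. The function name and the example rates and event counts are hypothetical and serve only to show the arithmetic.

```python
import math


def rate_ratio_ci(r1: float, d1: int, r2: float, d2: int):
    """Return (R, lower, upper) for the ratio of two independent rates.

    r1, r2 are the rates; d1, d2 are the numbers of events behind them.
    """
    if d1 <= 0 or d2 <= 0 or r2 <= 0:
        raise ValueError("d1, d2, and r2 must all be positive")
    ratio = r1 / r2
    half_width = 1.96 * ratio * math.sqrt(1 / d1 + 1 / d2)
    return ratio, ratio - half_width, ratio + half_width


# Hypothetical example: 60 events (rate 6.0) versus 40 events (rate 4.0).
R, lower, upper = rate_ratio_ci(6.0, 60, 4.0, 40)
print(f"R = {R:.2f}, 95% CI: {lower:.2f} to {upper:.2f}")
```

In this illustration the interval (about 0.90 to 2.10) includes 1, so the two rates would not be judged different at the 95% level.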


TABLE XII.6. Formula for calculating the 95% confidence interval for the difference
between two independent rates

Let    r1 = rate for period 1 (or area 1)

       n1 = denominator upon which r1 is based

       r2 = rate for period 2 (or area 2)

       n2 = denominator upon which r2 is based

       D = r1 - r2

The limits of the 95% confidence interval are:

upper limit: $D + 61.981\sqrt{r_1/n_1 + r_2/n_2}$

lower limit: $D - 61.981\sqrt{r_1/n_1 + r_2/n_2}$
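
The limits D ± 61.981·sqrt(r1/n1 + r2/n2) can be sketched in Python as below. The function name and the example inputs are hypothetical. The sketch assumes the rates are expressed per 1,000, as in Table XII.4, because the same constant 61.981 appears; the table itself does not state the unit.

```python
import math


def rate_difference_ci(r1: float, n1: float, r2: float, n2: float):
    """Return (D, lower, upper) for the difference of two independent rates.

    Rates are assumed to be per 1,000; n1 and n2 are their denominators.
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("denominators must be positive")
    diff = r1 - r2
    half_width = 61.981 * math.sqrt(r1 / n1 + r2 / n2)
    return diff, diff - half_width, diff + half_width


# Hypothetical example: 6.0 versus 4.0 per 1,000, each based on 10,000 persons.
D, lower, upper = rate_difference_ci(6.0, 10_000, 4.0, 10_000)
print(f"D = {D:.2f}, 95% CI: {lower:.2f} to {upper:.2f} per 1,000")
```

An interval that excludes zero suggests the two rates differ; here the lower limit sits just above zero, so the illustrative difference is borderline.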


Table XIII.1. Examples of data sources for surveillance in
developing countries

I.  Case  reports 

a.  from  health  stations  or  hospitals 

b.  from  sentinel  sites 

II.  Births  and  deaths 

a.  from  hospitals 

b.  from  sentinel  sites 

c.  complete  ascertainment 

III.  Laboratory  reports  (usually  from  hospitals) 

IV.  Sample  surveys  (particularly  cluster  surveys) 


Table XIII.2. Health problems ranked according to preventability and
treatability, Thailand, 1987

Rank  Disease         Total     Prevent-    Disease                 Total     Treat-
                      score     ability                             score     ability
                      (4-16)*   (H-M-L)**                           (4-16)    (H-M-L)

 1.   Tetanus           7       H           Malaria                  12       H
 2.   Poliomyelitis     7       H           Pneumonia                11       H
 3.   Measles           7       H           Dengue (hemorrhagic)     10       H
 4.   Diphtheria        6       H           Acute diarrhea           10       H
 5.   Rabies            6       H           Tuberculosis              9       H
 6.   Rubella           5       H           Venereal disease          9       H
 7.   Traffic injury   16       M           Dysentery                 8       H
 8.   Stroke           15       M           Conjunctivitis            7       H
 9.   Malaria          12       M           Influenza                 7       H
10.   Peptic ulcer     11       M           Measles                   7       H

Source: "Review of the Health Situation in Thailand: Priority Ranking of
Diseases."

*  Rated on a scale of 4 (low) to 16 (high)
** H=high, M=medium, L=low


Table XIII.3. Examples of objectives linked to surveillance components that
will measure objectives

Surveillance-linked objectives

Objective                                            Surveillance component
                                                     that measures objective

Priority area #1--Diarrhea

•  Health status--Reduce diarrhea mortality by       Vital-event registration
   25% by 1995                                       in five sentinel areas

•  Risk factor--Increase female literacy of          Regularly conducted
   10- to 14-year-olds to 80% by 1995                survey

•  Health activity--Increase to 90% the              Regularly conducted
   proportion of 0- to 4-year-olds given             health survey
   appropriate home fluids by 1995                   Local--exit interviews

Priority area #2--Measles

•  Health status--Reduce measles mortality by        Vital-event registration
   25% by 1995                                       in five sentinel areas

•  Health status--Reduce number of reported          National disease-
   measles cases by 50% by 1995 compared with        reporting system
   1990

•  Health activity--Increase percentage of 12-       Regularly conducted
   to 23-month-olds with one dose of measles         health survey
   vaccine to 90% nationwide

•  Health activity--Increase to 80% the              Exit interviews of
   percentage of districts with one-dose             mothers of 50 12- to 23-
   measles vaccination coverage of 12- to 23-        month-olds at all health
   month-olds of 90%                                 facilities in district
                                                     twice a year

Priority area #5--HIV/AIDS

•  Health status--Stabilize at 10% the               Sentinel HIV testing of
   proportion of 20- to 25-year-old women who        20- to 25-year-old women
   have babies at the capital city hospital          who have babies in
   and who are HIV-positive by 1993                  capital city

•  Health status--No increase in the 2% HIV          Sentinel HIV testing of
   seroprevalence of rural women who have            women having babies in
   babies that are HIV-positive by 1993              capital city

•  Risk factor--Reduction of HIV-risk taking         Reporting of clinical
   behavior by 50% in 1994 in areas with HIV         chancroid through the
   seroprevalence of STD patients >10% (an           national disease-
   indicator of entrance of HIV into                 reporting system
   community)
                                                     Laboratory--Syphilis
                                                     serology testing of 20-
                                                     to 25-year-old women
                                                     having babies in
                                                     affected areas

                                                     Exit interviews in
                                                     affected areas

•  Health activity--Increase to 75% the              Nationwide only--
   percentage of sexual contacts whose               Regularly-conducted
   partners are not spouses who also use             health survey
   condoms by 1995 in areas with HIV
   seroprevalence of STD patients >10%               Exit interviews in
                                                     affected areas

                                                     Nationwide only--
                                                     Regularly-conducted
                                                     health survey


FIGURE 1.1. Reported cases of congenital syphilis among infants <1 year of age and rates
of primary and secondary (P&S) syphilis among women--United States, 1970-1991

Note: The surveillance case definition for congenital syphilis changed in 1989.
Source: Centers for Disease Control.


FIGURE 1.2. Salmonella rates in New Hampshire and contiguous states, by county

Cases per 100,000

.01-.80*
.81-1.60
1.61-3.20
>3.20

Unshaded counties=no cases reported


FIGURE 1.3. Homicide rate, by age and gender of victim, United States, 1986


FIGURE 1.4. Malaria rates, by year--United States, 1930-1988


FIGURE 1.5. Reported cases of measles, by age group, United States, 1980-1982*

* Rates estimated by extrapolating age, from reported case-patients with known age.


FIGURE 1.6. Semi-logarithmic-scale line graph of reported cases of paralytic
poliomyelitis--United States, 1951-1989


FIGURE 1.7. Percentage of reported cases of gonorrhea caused
by antibiotic-resistant strains--United States, 1980-1990


FIGURE 1.8. Cesarean deliveries as a percentage of all deliveries in U.S. hospitals,
by year, 1970-1990

[Line graph. Horizontal axis: Year, 1970-1990. Vertical axis scale: 5 to 30.]


FIGURE V.1. Crude, gender-specific and gender-race-specific
cases of primary and secondary syphilis--United
States, 1981-1990, comparison of differential trends


FIGURE V.2. Dot plot of results of swine influenza virus (SIV)
hemagglutination-inhibition (HI) antibody testing among
exposed and unexposed swine exhibitors--Wisconsin, 1988


FIGURE V.3. Ordered data series and stem-and-leaf display of 39 4-week totals of reported cases
of meningococcal infections--United States, 1987-1989

1987:   226, 307, 350, 236, 222, 258, 197, 167, 138, 108, 191, 190, 201

1988:   216, 238, 331, 270, 265, 156, 164, 142, 112, 111, 153, 138, 159

1989:   145, 306, 314, 264, 222, 195, 155, 149, 102, 117, 174, 158, 159

In this example the first two digits of each datum serve as the stem and the third digit serves as a
leaf, e.g., for the numbers 264 and 265, the stem and leaves appear as 26 (stem) and 45 (leaves).
Since further division of the stems would result in an attenuated distributional shape, each stem
represents a range of 20 numbers, e.g., the stem 26 represents any number from 260 to 279 so that
for the number 270, the stem and leaf appear as 26 (stem) and 0 (leaf).
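
The stem-and-leaf rule described above (each stem spans 20 numbers, the leaf is the final digit) can be sketched in Python as follows. The function name is hypothetical, and the output layout is only an approximation of the printed display, which is not reproduced here.

```python
from collections import defaultdict

# The 39 four-week totals listed above, 1987-1989.
TOTALS = [
    226, 307, 350, 236, 222, 258, 197, 167, 138, 108, 191, 190, 201,  # 1987
    216, 238, 331, 270, 265, 156, 164, 142, 112, 111, 153, 138, 159,  # 1988
    145, 306, 314, 264, 222, 195, 155, 149, 102, 117, 174, 158, 159,  # 1989
]


def stem_and_leaf(values, stem_width=20):
    """Group three-digit values into stems that each span `stem_width` numbers.

    A stem is labeled with the first two digits of the lowest value it can
    hold (260-279 -> stem 26); the leaf is the final digit of the value.
    """
    rows = defaultdict(list)
    for value in sorted(values):
        stem = (value // stem_width) * stem_width // 10
        rows[stem].append(value % 10)
    return dict(rows)


display = stem_and_leaf(TOTALS)
for stem, leaves in display.items():
    print(f"{stem:>3} | {''.join(str(leaf) for leaf in leaves)}")
```

Because each stem covers two tens, a leaf alone does not say which ten a value came from (0 under stem 26 is 270, not 260); that ambiguity is the price of the smoother distributional shape.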


FIGURE V.4. Scatter plot of 39 4-week totals of reported cases of
meningococcal infections--United States, 1987-1989


FIGURE V.5. Box plot of 39 4-week totals of reported cases of
meningococcal infections--United States, 1987-1989


FIGURE V.6. Histogram (epidemic curve) of reported cases of
paralytic poliomyelitis--Oman, January 1988-March 1989


FIGURE V.7. Sample cumulative attack rate, by grade in
school and time of onset--North Carolina, 1985


FIGURE V.8. Survival curves over time, based on serum
testosterone level, Eastern Cooperative Oncology Group


FIGURE V.9. Frequency polygon of reported
cases of encephalitis--United States, 1965


FIGURE V.10. Group bar chart of case-fatality rates from ectopic
pregnancy, by age group and race--United States, 1970-1987


FIGURE V.11. Stacked bar chart of underlying causes of infant mortality,
by racial/ethnic group and age at death--United States, 1983


FIGURE V.12. Deviation bar chart of notifiable disease reports, comparison
of 4-week totals ending May 23, 1992, with historical data--United States

[Deviation bar chart. Columns: Disease; Decrease; Increase; Cases current 4 weeks.
Horizontal axis: Ratio (log scale)*; legible tick marks at 0.125, 0.25, 0.5, and 1.
Hatched portion of a bar: beyond historical limits. Diseases shown:]

Aseptic meningitis
Encephalitis (primary)
Hepatitis A
Hepatitis B
Hepatitis, non-A, non-B
Hepatitis (unspecified)
Legionellosis
Malaria
Measles (total)
Meningococcal infections
Mumps
Pertussis
Rabies (animal)
Rubella

*  Ratio  of  current  4-week  total  to  the  mean  of  15  4-week  totals  (from  previous,  comparable,  and  subsequent 
4-week  periods  for  the  past  5  years).  The  point  where  the  hatched  area  begins  is  based  on  the  mean  and 
two  standard  deviations  of  these  4-week  totals. 
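
The footnote's comparison can be sketched in Python: the current 4-week total is divided by the mean of the 15 historical totals, and the historical limits are taken as the mean plus or minus two standard deviations, expressed on the same ratio scale. The function name and the data are hypothetical, and the use of the sample standard deviation is an assumption, since the footnote does not say which form is used.

```python
import statistics


def compare_with_history(current_total, historical_totals):
    """Return (ratio, lower_limit, upper_limit, beyond_limits).

    ratio is current_total / mean(historical_totals); the limits are
    (mean -/+ 2 SD) / mean, so they are directly comparable with the ratio.
    """
    mean = statistics.mean(historical_totals)
    sd = statistics.stdev(historical_totals)  # sample SD: an assumption
    ratio = current_total / mean
    lower = (mean - 2 * sd) / mean
    upper = (mean + 2 * sd) / mean
    beyond = ratio < lower or ratio > upper
    return ratio, lower, upper, beyond


# Hypothetical data: 15 historical 4-week totals (3 periods x 5 years).
history = [90, 100, 110] * 5
ratio, lower, upper, beyond = compare_with_history(150, history)
print(f"ratio = {ratio:.2f}, limits = {lower:.2f} to {upper:.2f}, beyond = {beyond}")
```

Plotting the ratio on a log scale, as the figure does, places a halving and a doubling at equal distances from 1.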


FIGURE V.13. Pie charts of poliomyelitis vaccination status of children ages 1-4 years
in cities with populations ≥250,000, by financial status--United States, 1969

Adequately vaccinated: 3+ doses inactivated poliovirus vaccine (IPV) and/or
3 doses oral poliovirus vaccine (OPV).

Inadequately vaccinated: Some poliovirus vaccine, but <3 doses of IPV
and/or <3 doses of OPV.

Not vaccinated: No vaccine given.


FIGURE V.14. Spot map of deaths from smallpox--California, 1915-1924


FIGURE V.15. Choropleth map of confirmed and presumptive cases of
St. Louis encephalitis, by county--Florida, 1990*


FIGURE V.16. Density-equalizing map of California (based upon
population density), depicting deaths from smallpox, 1915-1924


FIGURE VI.1. Example: Data used for report published during week 20 (May 23, 1992)

* For example, X0 is total of cases reported for weeks 16-19, 1992.


FIGURE VIII.1. National Notifiable Diseases Surveillance System


FIGURE VIII.2. Biases in surveillance


FIGURE XII.1. Cartoon depicting mumps as a public health problem, Tennessee


FIGURE XIII.A.1. Percentage of case-patients vaccinated (PCV) per percentage
of population vaccinated (PPV) for seven values of vaccine efficacy (VE)

[Line graph. Vertical axis: PCV, 0 to 100. Horizontal axis: PPV, 0 to 100.
One curve for each value of vaccine efficacy.]

$\mathrm{PCV} = \dfrac{\mathrm{PPV} - (\mathrm{PPV} \times \mathrm{VE})}{1 - (\mathrm{PPV} \times \mathrm{VE})}$
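
The relation shown in the figure, PCV = [PPV - (PPV x VE)] / [1 - (PPV x VE)], can be evaluated with a short Python sketch. The function name is hypothetical, PPV and VE are entered as proportions (0 to 1) rather than percentages, and the vaccine efficacy values in the loop are illustrative choices, not necessarily the seven values plotted in the figure.

```python
def pcv(ppv: float, ve: float) -> float:
    """Proportion of case-patients vaccinated, given the proportion of the
    population vaccinated (ppv) and the vaccine efficacy (ve), both in 0-1."""
    if not (0.0 <= ppv <= 1.0 and 0.0 <= ve <= 1.0):
        raise ValueError("ppv and ve must be proportions between 0 and 1")
    denominator = 1.0 - ppv * ve
    if denominator == 0.0:
        # Everyone is vaccinated with a perfect vaccine: no cases occur.
        raise ValueError("PCV is undefined when ppv = 1 and ve = 1")
    return (ppv - ppv * ve) / denominator


# Illustrative efficacy values (assumed, not read from the figure).
for ve in (0.50, 0.80, 0.90, 0.95):
    row = ", ".join(f"{100 * pcv(p / 10, ve):5.1f}" for p in range(0, 10))
    print(f"VE={ve:.2f}: PCV (%) at PPV=0..90%: {row}")
```

Even with a highly effective vaccine, a large share of cases can be vaccinated once coverage is high: at 90% coverage and 90% efficacy, just under half of case-patients are vaccinated.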


Reproduced  by  NTIS 

National  Technical  Information  Service 
U.S.  Department  of  Commerce 
Springfield,   VA  22161 




PARKLAWN  HEALTH   LIBRARY 

WA  105  P9355  1992 


Principles  and  practice  of 
public  health  surveillance 



Parklawn  Health  Library 
U.S.  Public  Health  Service 
Parklawn  Bldg.  -  Rm.13-12 
5600  Fishers  Lane 
Rockville,  Maryland  20857 




